CH4910 Notes: Detailed Table of Contents

Communication in Chemistry

1. Chemistry Newsgroups and Discussion Lists

2. Molecular Visualization Tools and Sites

3. Science Writing Aids

4. The Publication Process: Primary, Secondary, and Tertiary Sources

How and Where to Start

5. Guides to Chemical Information Sources and Databases

6. General Information on Computer Searching

7. Current Awareness, Reviews, and Document Delivery

8. Background Reading: Dictionaries, Encyclopedias, and Other Books

How and Where to Search: General

9. Searching by Author or Organization Names and by Known Citations

10. Searching by Subject

11. Searching by Chemical Name and Formula

12. Structure Searching

How and Where to Search: Specialized

13. Patents

14. Searching for Information Involving Chemical Measurements (Analytical/Constitutional Chemistry)

15. Searching for Chemical and Physical Properties of Substances

16. Searching for the Synthesis of Specific Compounds or Classes of Compounds (Reaction Chemistry)

17. Chemical Safety or Toxicology Information

Miscellaneous

18. Chemical History, Biography, Directories, Industry Sources

19. Teaching and Study of Chemistry; Careers in Chemistry

Updated: January 6, 2011. 

The information below was adapted from material available online from Indiana University and reformatted for MTU.


Mondays, 5:05-5:55 p.m.
Lectures: CH101

Instructor: Rudy Luck


Communication in Chemistry

Part 1: Chemistry Newsgroups and Discussion Lists

I. INTERNET LISTSERVS

A. CHARACTERISTICS OF LISTSERV DISCUSSION LISTS

B. The Original LISTSERV Program

C. Joining a LISTSERV list.


II. USENET NEWSGROUPS

A. CHARACTERISTICS OF NEWSGROUPS


B. MAJOR NEWSGROUPS


 

Part 2: Molecular Visualization Tools and Sites

I. HTML: HYPERTEXT MARKUP LANGUAGE

By now, most students are probably savvy users of the Web, but some may not know the details of how browsers actually work. It is important to understand that chemistry-specific helper applications and plug-ins extend the visualization capabilities of the standard Web browsers. Every personal and laboratory computer used by chemists, as well as the computers in science libraries, should be equipped with the CHIME plug-in available from MDL and the free Adobe Acrobat Reader.

 

A. General Points About HTML

o Absolute reference -- contains the complete address: host name, directory path, and file name

o Relative reference -- assumes that the previous machine and directory path are being used: just the file name (or possibly a subdirectory and file name) is specified

o URLs exist for WWW files, FTP, Gopher, UseNet, Telnet, etc.

o General URL Format: protocolcode://server_address/path/filename

o Examples:

  http://chemistry.mtu.edu/
  mailto:rluck@mtu.edu
  news:sci.chem
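
The general URL format and the difference between absolute and relative references can be explored with a short Python sketch. It uses the standard library's urllib.parse module; the base page path "/courses/index.html" and the file names "notes.html" and "images/logo.gif" are hypothetical and serve only as illustration.

```python
from urllib.parse import urlsplit, urljoin

# The three example URLs from the notes.
examples = ["http://chemistry.mtu.edu/", "mailto:rluck@mtu.edu", "news:sci.chem"]

# Split each URL into its protocol code (scheme), server address, and path.
parsed = {url: urlsplit(url) for url in examples}
for url, parts in parsed.items():
    print(f"{url}: scheme={parts.scheme!r} server={parts.netloc!r} path={parts.path!r}")

# A relative reference is resolved against the page that contains it.
# The base path below is a made-up example, not a real page.
base = "http://chemistry.mtu.edu/courses/index.html"
same_directory = urljoin(base, "notes.html")
subdirectory = urljoin(base, "images/logo.gif")
print(same_directory)
print(subdirectory)
```

Note that only the http URL has a server address; the mailto and news URLs carry everything after the protocol code in the path component.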

B. Helpers and Plug-ins

II. FILE FORMATS ON THE INTERNET

Format   Description                                               Helper/Plugin
.gif     CompuServe Graphics Interchange Format                    browser
.jpg     JPEG (Joint Photographic Experts Group) graphics format   browser
.pdf     Adobe's Portable Document Format                          Acrobat Reader
.tif     TIFF graphics format (Group IV fax)
.mid     MIDI music format file
.mpg     MPEG movie format
.mov     QuickTime movie format file
.wav     WAVE format audio file
and many, many others

 

 

III. MOLECULAR FORMATS AND MIME

A. Chemical MIME -- MIME (Multipurpose Internet Mail Extensions) is a protocol for attaching special files to electronic mail messages or embedding them in HTML documents; the chemical MIME types extend it to chemistry file formats.

Chemical MIME Formats (chemical/x- )

Extension   MIME Subtype    Use                                   Helper/Plugin
kin         x-kinemage      Kinemage file for macromolecules      MAGE
pdb         x-pdb           Protein Data Bank format              Rasmol, Chime
jdx         x-jcamp-dx      Spectra format: infrared, NMR, Mass   JCAMP-DX
mol         x-mdl-molfile   MDL's Molecular File Format           Chime
chm         x-chemdraw      CambridgeSoft's ChemDraw format       ChemDraw
and others
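
As a rough illustration of how software can pick a helper or plugin from a file name, the sketch below maps the extensions in the table above to their chemical MIME types. The function name chemical_mime_type and the sample file names are hypothetical; real browsers and mail programs use their own configuration files for this lookup.

```python
from pathlib import PurePosixPath

# Extension -> MIME subtype, taken from the table of chemical MIME formats.
CHEMICAL_MIME_SUBTYPES = {
    "kin": "x-kinemage",
    "pdb": "x-pdb",
    "jdx": "x-jcamp-dx",
    "mol": "x-mdl-molfile",
    "chm": "x-chemdraw",
}

def chemical_mime_type(filename):
    """Return the chemical MIME type for a file name, or None if unknown."""
    # Take the text after the last dot and ignore upper/lower case.
    extension = PurePosixPath(filename).suffix.lstrip(".").lower()
    subtype = CHEMICAL_MIME_SUBTYPES.get(extension)
    return f"chemical/{subtype}" if subtype else None

# Sample file names are invented for the example.
for name in ("caffeine.mol", "PROTEIN.PDB", "spectrum.jdx", "notes.txt"):
    print(name, "->", chemical_mime_type(name))
```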

 

 

B. Other Formats

C. Mailers that understand MIME

D. Internet Sites for Further Information


Part 3: Science Writing Aids

I. Introduction

That science has advanced so rapidly in the last few centuries is largely due to the major developments in communications and publication technology, starting with the invention of the printing press and movable type. The vast majority of the archival record of science is still very much in the format of the printed word, and paper-based forms of communication will remain a very important part of science. The records of science exist because scientists write about and publish the results of their experiments. As a scientist, you will be called upon to write many different types of compositions, ranging from laboratory notebooks to grant proposals, technical reports, and journal articles. There are books that attempt to teach you how to write better, such as:

Fieser, Louis; Fieser, Mary. Style Guide for Chemists; Krieger: Huntington, NY, 1972.
Schoenfeld, Robert. The Chemist's English, 3rd ed.; VCH Publishers: Deerfield Beach, FL, 1989.
White, Howard J. Reporting Experimental Data: Selected Reprints; ACS: Washington, 1993.
The ACS Style Guide: A Manual for Authors and Editors, 2nd ed.; ACS: Washington, 1997.

There are science writing courses or style manuals on the Internet that you can take advantage of (see the SIRCh link for some of them). In this section, we will become acquainted with other tools that can help make the process of researching and writing scientific documents a much easier chore.

 

Images can easily be inserted into modern word-processing programs such as Microsoft Word or WordPerfect. These programs come with spell-checker dictionaries, but unfortunately their scientific vocabulary is quite limited. Scientific dictionary programs to supplement word-processing dictionaries can be found at ChemSW, Chemistry Software for Windows.

 

Graphing and data analysis programs make the task of visualizing data much simpler nowadays. These are designed to provide a combination of the common, frequently used features found in spreadsheet, visualization, and statistical software. One such package is KaleidaGraph from Synergy Software. Another is Origin. There are even scientific writing packages that handle mathematical expressions, such as MacKichan Software's Scientific Notebook.

 

A bewildering array of products could be considered tools to aid in writing, and many of those can be found among the products at ChemSW (formerly, WindowChem Software).

II. Chemical Drawing and Nomenclature Programs

Ideally, a chemical drawing program would integrate easily with word-processing software and would also give some assistance with the complex formal nomenclature system of chemistry. There is even a program, CLiDE, that operates like Optical Character Recognition (OCR) software for chemistry. CLiDE can convert 2D representations of structures into MDL's mol file or CambridgeSoft's ChemDraw formats.

 

One of the most popular chemistry structure drawing programs is CambridgeSoft's ChemDraw.

[Figure: ChemDraw Net's opening screen]

The opening screen of ChemDraw is typical of such programs, with selections of pre-drawn chemical objects to choose from. A version of ChemDraw that serves as a WWW structure client is available free to educational users at the present time. Other producers of chemical drawing software also give academics free copies of their drawing products--either the full versions, as with MDL's ISIS/Draw, or freeware versions such as ACD's ChemSketch or Chemaxon's Marvin.

 

Several programs have been developed to take the image of the chemical structure one step further: to give it an acceptable IUPAC (International Union of Pure and Applied Chemistry) name. Thus, a program such as AUTONOM will properly name close to 90 percent of the organic substances that are drawn with the program. ACD's Name program claims even higher accuracy. For other programs with this functionality, see the SIRCh supplementary page that corresponds to this lesson. Another page has a link to programs that convert chemical structures from one format to another or scan them into computer-readable formats.

III. Personal Database Software Packages (Citation Managers)

Several of the chemical structure drawing packages discussed above are part of a package of programs that can manage bibliographic data and even physical and chemical data in a single database. Thus, ISIS/Draw and ISIS/Base complement one another. Likewise, CambridgeSoft has the ChemOffice suite for drawing, modeling, and information handling that includes ChemFinder. Others have developed products that work with existing relational database software, such as Accelrys's Accord for MS Excel or Access.

The more general personal bibliography manager software packages, such as ProCite, Reference Manager, and EndNote, while lacking in chemical capabilities, have other important features that make them very useful. For example, one can download records from a bibliographic database such as Science Citation Index and import the records directly into a personal database, without manipulating the data. Once loaded, the data can be re-used in the writing process. Such programs typically have a number of style sheets that allow the data in the records to be changed to suit the requirements of various publishers or authoritative style guides. One can mark a point in a manuscript and tie it to a reference in the bibliography that is automatically generated from the database. A recent feature in the personal database software is the capability to capture certain data from the Web, such as the TITLE and URL of a Web document and automatically load them into the database. Once a record has been formed for a Web page, the record can be recalled and automatically loaded through a Web browser. For additional information on the features of these three popular bibliography managers, see: Bibliographic Reference Managers.

IV. The ACS Style Guide

The journals published by the American Chemical Society are among the most important and highly cited scientific journals in the world. Books published by the ACS are likewise highly respected. To assist both authors of papers and books and editors of ACS publications, the ACS has produced The ACS Style Guide: A Manual for Authors and Editors. There you will find instructions and examples on the format required to cite all sorts of documents in a bibliography. In addition, the guide has a list of abbreviations for the most frequently cited journal titles. Other topics include grammar, style, usage, illustrations, tables, lists, and units of measure, as well as the conventions used in chemistry. It also covers numerous related topics, from peer review and copyrights to oral presentations and the ACS ethical guidelines for publication.

 


 

Part 4: The Publication Process: Primary, Secondary, and Tertiary Sources

I. Types of Primary Literature.

PRIMARY LITERATURE refers to the first place in which a scientist reveals the results of scientific investigations to the general population in a publicly accessible document. In many cases, the document that describes these results has undergone rigorous review by one or more peers who help ensure the integrity of scientific knowledge. Increasingly, however, PREPRINTS, unreviewed literature posted by the original author, are appearing on the Web. More traditional primary publications include scientific journal articles, published conference proceedings, technical reports, dissertations or theses, and patents. All of these are collectively called DOCUMENTS.

 

Different types of information and different levels of detail are found in each of the DOCUMENT TYPES of primary literature. Since that is the case, it is sometimes important to distinguish the document type when a search is being conducted. Therefore, the type of primary document may be coded in a database or printed abstracting or indexing journal that covers more than one type of document in order to aid in retrieval.

 

Let's look at a few journal titles that one might expect to find in any respectable chemistry library. The American Chemical Society Committee on Professional Training's Library Guidelines for ACS Approved Programs includes a Journal List for Undergraduate Programs. On the list are news journals such as Science and Nature, primary research journals such as Inorganic Chemistry, and primary journals designed to rapidly communicate new research results such as Chemical Communications and Tetrahedron Letters. Also on the list are some secondary sources, such as Chemical Reviews.

II. The Secondary Literature.

The terms "primary," "secondary," and "tertiary" literature are interpreted differently by different authors. We will call SECONDARY LITERATURE textbooks, treatises, monographs and "multigraphs" (books with multiple authors), encyclopedias and dictionaries, handbooks and data compilations, review articles and review serials, bibliographies, and indexing and abstracting services.

 

All of these secondary works have in common the goal of repackaging and better organizing the new information reported by researchers in the primary literature. Since there is additional work involved in creating the secondary works (that is, they gather their information and facts from the primary works), they are always less current than the primary literature.

III. The Temporal Relationship Between New Primary Literature and the Secondary Literature.

Depending on the type of effort expended in their compilation, the secondary works require varying periods of time to repackage or explicate new knowledge. Thus, there is a definite flow of scientific information from the inception of a research idea through the various types of secondary sources. One can typically expect to find the repackaged new knowledge or pointers to/summaries of new primary works in:

It is important to understand that the lag time is only partially linked to the frequency of updating of the secondary publication or database. An illustration of the time lag for journal articles entered in the Chemical Abstracts Service database, which is updated weekly, can be seen in the following abstracts:

  1. CA abstract 93:25540j appeared in the volume 93 no. 3 (July 21, 1980) issue of Chemical Abstracts, but the original journal article was in the v. 102 no. 7 (March 26, 1980) issue of the Journal of the American Chemical Society. Time lag: apparently 117 days.
  2. CA abstract 106:205033s appeared in CA issue v. 106 no. 24 (June 15, 1987), but the original journal article was in the v. 37 no. 1 (April 1987) issue of the Journal of Photochemistry. Time lag: perhaps 60-75 days.

Thus, the fact that an abstracting or indexing journal is updated every week or even daily does not necessarily mean that the primary literature covered in that update is the new primary literature that appeared that week. Quite the contrary, there is almost always some lag time between the appearance of the new primary literature and its coverage in the secondary sources. However, with some abstracting or indexing sources, notably ingenta (formerly, UnCover), the time lag is quite small, on the order of a few days.
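
The time lags quoted for the two abstracts above can be checked with simple date arithmetic. The following is a minimal sketch using Python's standard datetime and calendar modules; the helper names lag_days and lag_range are hypothetical. Because the second journal issue is dated only "April 1987", the sketch assumes it could have appeared on any day of that month and reports the resulting range.

```python
import calendar
from datetime import date

def lag_days(article_date, abstract_date):
    """Days between publication of the article and of its CA abstract."""
    return (abstract_date - article_date).days

def lag_range(year, month, abstract_date):
    """(shortest, longest) lag when only the article's month is known.

    Assumption: the issue appeared on some day within the stated month.
    """
    last_day = calendar.monthrange(year, month)[1]
    longest = lag_days(date(year, month, 1), abstract_date)
    shortest = lag_days(date(year, month, last_day), abstract_date)
    return shortest, longest

# Example 1: J. Am. Chem. Soc., March 26, 1980 -> CA issue of July 21, 1980.
first_lag = lag_days(date(1980, 3, 26), date(1980, 7, 21))
print(first_lag)

# Example 2: J. Photochem., April 1987 -> CA issue of June 15, 1987.
second_lag = lag_range(1987, 4, date(1987, 6, 15))
print(second_lag)
```

The first result agrees with the 117 days given above. The second gives a range of 46 to 75 days, so the estimate of 60-75 days corresponds to an issue that appeared in the first half of April.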

A development that is reducing the time lag between the publication of new primary literature and its inclusion in abstracting and indexing (A&I) services is the electronic publishing of new journal articles long before the appearance of the print versions. The "As Soon As Publishable" (ASAP) policy of the American Chemical Society and similar early publication policies of other primary publishers (e.g., Springer-Verlag) have tended to drastically reduce the lag time. Under ASAP, the articles published in ACS journals appear in the electronic versions of the primary journals some two to six weeks prior to the corresponding print title. The references to the articles are also fed into the Chemical Abstracts database (produced by Chemical Abstracts Service, a division of the ACS) much earlier than those for the primary journals of other publishers.

 

Other categories of secondary works are directories, buyers' guides, biographical works, etc. These cannot be related easily to the primary literature in a temporal sense.

IV. Types of Computer-Readable Sources.

There are databases that correspond to the different primary and secondary printed sources. They can be categorized as:

Non-bibliographic databases are sometimes called FACTUAL or SOURCE databases, as opposed to bibliographic databases that traditionally give only pointers to primary publications that have facts in them.

 

The Internet, especially applications based on the World-Wide Web, is accelerating the creation of true full-text databases and blurring the distinction between abstracting/indexing databases and primary journals. Many scientific full-text databases on the Web now include graphics, such as the Web versions of the American Chemical Society journals. Both HTML versions and pdf versions of the articles are found in some databases. The HTML versions may have enhanced features such as links in the references of the bibliography of an article to ABSTRACTS (summaries) of the cited articles in an A&I database, with further links to the full-text Web version of the cited older articles. For example, it is now possible to link directly from the various options for searching the Chemical Abstracts database to over 1,000 primary journals through the ChemPort option. A project (CrossRef) is underway to provide direct links from the citations in an article directly to the cited article without having to visit an A&I service as an intermediate step.

 

Most primary chemistry journals are now available on the Web. See the list of WWW versions of primary journals accessible through the MTU Chemistry Library.

V. Options for Database Searching.

The options for database searching include:

VENDORS of online search services (for example, STN International) lease or acquire databases from the database PRODUCERS (such as Chemical Abstracts Service or the Institute for Scientific Information) and make them available on remote computers. For a given vendor, which may have dozens or hundreds of databases on its computers, the databases are all searched by a common command language or graphical user interface. In the vast majority of these cases, there is a fee for searching the databases.

Another option is to search free databases on the Internet. Usually, the quality of the databases that are freely accessible on the Internet is not as high as that of the commercial databases. In addition, there are many differences in the search interfaces that the user encounters among free Internet databases. Nevertheless, they should not be ignored for certain types of searches.

Local organizations now routinely load databases on their own computers. This includes the placement of CD-ROM products on local-area networks (LANs). Doing so often requires the searcher to utilize a number of different search systems, since each product typically comes with its own unique search interface. Several new models of providing database searching are now being explored with the advent of client/server computer systems.


How and Where to Start

Part 5: Guides to Chemical Information Sources and Databases

I. Introduction

Works that help you decide what secondary or primary tools to use or works that actually help you to use those tools are referred to as GUIDES, sometimes called TERTIARY tools. These may help:

Guides are found as printed books, directories of databases, directories of resources on the Internet, "how-to" manuals that accompany software or databases, and in several other formats, including online help files.

II. User Aids for Computer-Readable Databases

A particular type of guide is the DATABASE SUMMARY SHEETS provided by the VENDORS of online databases. Vendors are commercial entities that lease database content from DATABASE PRODUCERS and sell access to those files over public or restricted-access communication lines.

 

Name                                              Address                               Phone

OpenText Corporation                              222 Third Street, Ste 3300            617-621-0820
(Livelink Discovery Server, formerly BRS/Search)  Cambridge, MA 02142

Cambridge Scientific Abstracts                    7200 Wisconsin Avenue                 301-961-6750
                                                  Bethesda, MD 20814

CIS Chemical Information System                   Wyman Towers, 3100 St. Paul Street    410-243-0797
(NISC)                                            Baltimore, Maryland 21218

Dialog Information, Inc.                          2440 El Camino Real                   800-334-2564
(DIALOG and DataStar)                             Mountain View, CA 94040               415-254-7000

EBSCO Information Services

ingenta                                           44 Brattle Street, 4th Floor          1 800 296 2221 (within US only)
                                                  Cambridge, MA 02138                   617 395 4000

NLM (National Library of Medicine)                8600 Rockville Pike                   800-638-8480
                                                  Bethesda, MD 20894

Numerica/TDS                                      135 W. 50th St, Ste 1170              212-245-0044
                                                  New York, NY 10020-1012

Ovid Technologies, Inc.                           333 Seventh Avenue                    800-950-2035
                                                  New York, NY 10001                    212-563-3006

Questel/Orbit, Inc.                               8000 Westpark Drive                   800-456-7248
                                                  McLean, VA 22102                      703-442-0900

SilverPlatter Information, Inc. (ERL)             100 River Ridge Drive                 800-343-0064
Now: Ovid                                         Norwood, MA 02062                     617-769-2599

STN International                                 2540 Olentangy River Rd               800-848-6533
                                                  P.O. Box 02228                        614-447-3698
                                                  Columbus, OH 43202

 

 

The vendor Dialog Information Services makes its "bluesheet" database summary sheets available as file 415 on the service itself and on the Internet. STN's database summary sheets are also available as Internet WWW files. Look at the database summary sheets for the LCA (Chemical Abstracts Learning File on STN) and LREG (Chemical Abstracts chemical dictionary learning file on STN). Note the different search and display options available with these files and how the summary sheets help you to select the right way to enter the search if you were using the native command language of the STN system. In this course, you won't have to know anything about the STN command language because the SciFinder Scholar product does much of that work for you behind the scenes.

 

It is important to note that the same database may be available on several different vendors' systems, sometimes with different components of the database or different time periods available. For example, STN International offers the most data for the Chemical Abstracts database in its CA, Registry, and other files. These provide the text of the abstracts, the drawings of the chemical structures, enhanced indexing, and the capability to use the chemical structure as a search key. The CA and Registry databases and others are searchable in SciFinder Scholar, although it may not be apparent to you which database you are in at a given moment. (Note that the Chemical Abstracts database on DIALOG is called CA SEARCH.) Even a single vendor may offer multiple access points to certain databases, some designed for experienced searchers and others for novices.

 

Designed to serve the chemical information needs of undergraduate students, CA Student Edition contains references from January 1, 1967 to the present from the Chemical Abstracts database. Included are references and abstracts from a select list of a few hundred journals and review serials, as well as over 200,000 chemistry dissertations. The list of journals covered in CA Student Edition can be seen at: http://www.cas.org/New1/selist.html. Searches may be performed by author names, subject words (including chemical substance names), and CAS Registry Numbers, as well as other search strategies. No structure searching is possible in the database.

III. Database Guides and Online Aids for Database Selection.

There are comprehensive printed directories of commercially available databases and an ever-increasing number of guides to free resources on the Internet. A large number of the latter have been collected at the Internet Public Library.

Another approach to selecting a database is to let the commercial vendor's search system analyze which databases among their offerings have information relevant to your search. Dialog Information's DIALINDEX identifies which DIALOG files have information on a given topic; INFODEX is an online index to the contents of more than 30 databases on the Chemical Information System. The corresponding type of search on STN would be done with the STNindex feature. The STNGuide file is a database of STN Summary Sheets. It can assist in selecting the proper database to search.

IV. Comprehensive Chemistry Guides

There is an increasing amount of chemical information available on the Internet. A powerful guide to chemical data, including structures, on the Web is ChemFinder from CambridgeSoft. In addition to linking chemical substances to Web pages on the Internet, ChemFinder is also a handbook with reliable chemical data for many compounds.

A searchable guide to both printed and computer-based reference tools is the Chemistry Reference Sources Database (CRSD).

V. Search Strategy Formulation

A SEARCH STRATEGY is a map of a course of action that ought to result in finding an answer to a chemical information problem using library, free Internet, and/or commercial database resources. It involves such tasks as:

  1. identifying the main concepts and other parameters for the search (time period, types of documents to be retrieved, other factors to be considered, e.g., immediacy of the need for the answer)
  2. drawing up a list of terms and other search keys to be used (e.g., chemical structures, authors' names, chemical names, etc.)
  3. deciding what sources are most likely to have the answers
  4. searching the printed works or databases until the answer is found or you are satisfied that no answer can be found in the available resources.

For the third step, the works in this session will help you make appropriate choices.

 


 

Part 6: General Information on Computer Searching

I. Introduction

Database producers and database vendors make it possible to search files that are located outside our geographic area through the techniques of online database searching. The online database industry is now in its fourth decade, and many sophisticated search techniques have been developed during that period. In comparison, the search techniques found in Internet search engines might be considered rudimentary, but they are constantly improving.

 

Each vendor offers a range of databases, some of which are specific to a discipline (chemistry, physics, etc.). Others deal with mission-oriented problems such as energy or the environment, cutting across disciplines in their subject coverage. Once connected to a database vendor's system, it is possible to perform CROSS-DATABASE SEARCHES simultaneously in a number of related files (multi-file searching). There are also tools that libraries can purchase to perform "federated searching," the term currently used to describe searches using several databases at once. WebFeat is one such product to perform federated searches.

II. Computer-Readable Sources.

Recall that there are databases corresponding to the different primary and secondary printed sources:

In a sense, the Internet search engines have turned the entire Web into one giant database. However, it has been shown that no single Web search engine indexes everything on the Web. Usually, one-third or fewer of the publicly available pages are caught by any given search engine's robot that roams the Web looking for pages to index. No robot makes the voyage around the Web to collect Web pages every day; it might take months for a robot to complete its journey through the desired Web pages. Thus, no search engine's results are ever totally up to date. That is true whether you try Google, Hotbot, Northern Light, Altavista, or any other search engine. Furthermore, the free Web search engines do not have access to library databases such as the Web OPACs that tell you the holdings of the libraries, nor can they access any of the commercial vendors' offerings.

Nevertheless, the search engines are very powerful tools, and for certain types of questions, they can be very useful in a search for information. For example, many people, including chemists, maintain their own personal Web pages nowadays. For locating someone and perhaps finding a full or selective bibliography or a curriculum vitae (CV) of a chemist, the Web may offer the best route to reliable, up-to-date information. Likewise, very new or hot topics may be discussed in Web news groups or discussion lists long before they appear in traditional journals and, later, in abstracting and indexing services.

For all of these reasons, we are beginning to see the commercial vendors add options to transfer the search strategy used in a commercial database search to the Internet for further information. One example is Elsevier Science Direct's Scirus, which searches both Elsevier journals and the Web. Another is STN's eScience, which requires you to first search a commercial STN database before taking your search to the Web.

In spite of the ease of accessing the Web, it ought to be a fairly rare case that you begin a subject search for information with a Web search engine if you have easy access to online commercial databases in your organization. Databases such as the Web of Science (including Science Citation Index potentially all the way back to 1945), MDL's CrossFire (which, with the Gmelin and Beilstein databases, covers the literature of modern inorganic, organic, and organometallic chemistry back to their beginnings in the 18th and 19th centuries), and Chemical Abstracts (which covers all areas of chemistry in a comprehensive manner back to 1907) are usually much better first choices, if they are available to you.

III. Options for Database Searching.

The options for database searching include:

VENDORS of online search services (for example, STN International) lease or acquire databases from the database PRODUCERS (such as Chemical Abstracts Service or the Institute for Scientific Information) and make them available on remote computers. For a given vendor, which may have dozens or hundreds of databases on its computers, the databases are all searched by a common command language or graphical user interface. In the vast majority of these cases, there is a fee for searching the databases.

WEB SEARCH ENGINES

As noted above, the powerful search engines of today can provide a useful supplement to traditional online searches. A useful guide to search engines is maintained on the Search Engine Watch Web Site.

Some databases that are available for searching free on the Internet are of very high quality, for example, those produced by the National Library of Medicine or other government agencies or commercial organizations. However, the quality of most databases that are freely accessible on the Internet is likely not to be as high as that of commercial databases. In addition, there are many differences in the search interfaces that the user encounters among free Internet databases. Nevertheless, they should not be ignored for certain types of searches.

Chemical and pharmaceutical companies now routinely load databases on their own computers. This includes the placement of CD-ROM products on networks. CD-ROMs also often require the searcher to utilize a number of different search systems, since each product typically comes with its own unique search interface. Several new models of providing database searching are now being explored with the advent of client/server computer systems. At MTU, several such databases are available on computers in the reference section.

IV. Costs and Benefits of Online Searching

The costs of a commercial online search are usually not fixed, but are dependent on several factors, including telecommunications network charges (even a connection via the Internet is not free on a commercial system), connect time on the vendor's computer, royalties charged for the information extracted from the database (known as HIT CHARGES), and on some systems, charges for the search terms input in the search strategy.
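
The way these cost factors add up can be sketched in a few lines of Python. This is only an illustration: the class name SearchSession, its field names, and every rate and quantity below are invented placeholders, not the actual pricing of any vendor.

```python
from dataclasses import dataclass

@dataclass
class SearchSession:
    """One online search session; all values are hypothetical placeholders."""
    network_charge: float   # telecommunications charge for the session, in dollars
    connect_hours: float    # time connected to the vendor's computer
    connect_rate: float     # dollars per connect hour
    hits_displayed: int     # records extracted from the database
    hit_charge: float       # royalty (hit charge) per record, in dollars
    search_terms: int       # terms entered in the search strategy
    term_charge: float      # charge per search term, on systems that bill for terms

    def total_cost(self) -> float:
        # Sum of the four cost factors named in the text.
        return (self.network_charge
                + self.connect_hours * self.connect_rate
                + self.hits_displayed * self.hit_charge
                + self.search_terms * self.term_charge)

# Made-up numbers: a 30-minute search that displays 10 records using 4 terms.
session = SearchSession(network_charge=3.0, connect_hours=0.5, connect_rate=60.0,
                        hits_displayed=10, hit_charge=2.0,
                        search_terms=4, term_charge=0.5)
print(session.total_cost())
```

With these placeholder values the connect time and the hit charges dominate the total, which suggests why planning a search strategy before connecting can reduce the cost of a session.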

The benefits of using an online vendor to search databases include:

STN International is at present the only online vendor to have available the abstract data from Chemical Abstracts. The abstract's summary of the document provides a quick way to assess whether the document itself should be read for further information. See the examples of journal article and patent abstracts in the STN CA File Quick Reference Card. The card also shows examples of the Messenger search commands that must be used on the STN system when searching the CA database, with over 20,000,000 bibliographic records, in native command mode.

STN and Questel/ORBIT are the only vendors on which you can perform structure searches of the CAS Registry File. A subset of the CA database that is not used in this course is the CA Student Edition on the OCLC FirstSearch system.

V. Boolean Search Operators

Online search systems offer BOOLEAN SEARCH OPERATORS that show the logical relationship among different concepts. See "Operators for Relating Search Terms" for some examples of Boolean search operators on the STN system.

[Figure: Venn diagram of pie, cake, and ice cream]

The most common Boolean operators are OR, AND, and NOT.

The normal use of the English word "or" implies a choice, with only one thing possible in the final selection. In a Boolean sense, OR really grabs all of the items and puts them into a set. A special variant of the OR operator is XOR. XOR retrieves a document only if exactly one of the terms in the XOR statement is present, and skips any documents that have both terms.

Example: pie OR cake

If each of the pieces of pie and cake in a bakery were placed on its own plate and arranged on an enormous tray, we would satisfy the search (pie OR cake), and the tray would represent our answer set. Since the XOR operator was not used, there could even be some plates on which both pie and cake were found.

Example: cake AND ice cream

In this example, think of each of the pieces of cake as having to be on its own plate with some ice cream on top in order to satisfy the search.

Example: (cake AND ice cream) NOT chocolate

Example: (pie OR cake) NOT chocolate

Let's assume that you are allergic to chocolate. What would happen in the NOT examples if chocolate cake were the only type of cake available? In the first case, you would not get any dessert because the NOT completely eliminates the subset when one of the terms satisfies it. It throws out each of the plates containing the chocolate cake even if the ice cream is your favorite, vanilla. In the second NOT case, however, our search would allow us to have a piece of pie (as long as it wasn't chocolate pie or the plate didn't also have some chocolate cake on it!).

The NOT command must be used with caution in online searching since it could eliminate some documents that are of interest if they also happen to discuss aspects of a topic that are not of interest.
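The dessert examples above can be mimicked with Python's built-in set type. This is only an illustrative sketch: the plate numbers and their contents are invented, the helper name having is hypothetical, and a real search system works on document indexes, not on Python sets.

```python
# Each plate plays the role of a document; the strings on it are its index terms.
# The plates are invented for illustration. Note that chocolate cake is the
# only kind of cake available, as in the allergy example.
plates = {
    1: {"pie", "apple"},
    2: {"cake", "chocolate"},
    3: {"cake", "chocolate", "ice cream", "vanilla"},
    4: {"pie", "cake", "chocolate"},
    5: {"ice cream", "vanilla"},
}


def having(term):
    """Return the set of plate numbers on which the term appears."""
    return {number for number, contents in plates.items() if term in contents}


pie, cake = having("pie"), having("cake")
ice_cream, chocolate = having("ice cream"), having("chocolate")

pie_or_cake = pie | cake                      # OR: union of the two sets
pie_xor_cake = pie ^ cake                     # XOR: one term or the other, never both
cake_and_ice_cream = cake & ice_cream         # AND: intersection
first_not = (cake & ice_cream) - chocolate    # (cake AND ice cream) NOT chocolate
second_not = (pie | cake) - chocolate         # (pie OR cake) NOT chocolate

print(sorted(pie_or_cake))          # [1, 2, 3, 4]
print(sorted(pie_xor_cake))         # [1, 2, 3]
print(sorted(cake_and_ice_cream))   # [3]
print(sorted(first_not))            # []
print(sorted(second_not))           # [1]
```

Plate 4 shows the danger of NOT: it holds a perfectly good piece of pie, but it is thrown out of the second NOT search because chocolate cake shares the plate.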

There are more specific variants of the AND command that can be used to define the spatial relationships of search terms. These are called POSITIONAL or PROXIMITY OPERATORS. On STN, they are:

STN assumes that multi-word phrases are to be searched using the (W) operator in the absence of explicit positional or other Boolean operators.
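The difference between a plain AND and an adjacency requirement can be sketched in Python. The code below assumes that (W) requires the two words to sit next to each other in the order typed; the function names and the two titles are invented for the example, and the tokenizer is far cruder than anything a real search system uses.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into words (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def both_present(text, first, second):
    """Plain AND: both words occur somewhere in the text."""
    words = set(tokenize(text))
    return first in words and second in words


def adjacent_in_order(text, first, second):
    """Assumed (W) behavior: `first` is immediately followed by `second`."""
    words = tokenize(text)
    return any(a == first and b == second for a, b in zip(words, words[1:]))


title_a = "Mass spectrometry of isatin derivatives"
title_b = "Spectrometry of high-mass ions"

for title in (title_a, title_b):
    print(title,
          both_present(title, "mass", "spectrometry"),
          adjacent_in_order(title, "mass", "spectrometry"))
```

Both titles satisfy the AND search, but only the first satisfies the adjacency test, so the positional search gives the smaller and more precise answer set.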

VI. Truncation (Masking) of Characters to Expand a Search

TRUNCATION is the search technique that allows the searching of more than one form of a word with a single command.

 

Truncation can occur at the left end or the right end of a word stem or within the word. STN now allows all three types of truncation in the CA File Basic Index, an index of subject words from the title words, words in the abstracts, or index terms (including Registry Numbers for compounds discussed in the documents). The limit of terms that can be gathered in a set by truncation is 30,000 stems. For left truncation the search term must have at least four characters.

Novice searchers and even professionals sometimes make gross errors with truncation, especially in systems that allow both left- and right-hand truncation. Think what would happen if a search were run with these character strings truncated on both sides:

?HEMI?
?ION?

Every occurrence of the word "chemical" or "chemistry" would be pulled in the first search, and every English word that ends in -ION would be pulled in the second case. Probably not what the searcher would have wanted!
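The effect of careless truncation is easy to reproduce with a regular expression. The sketch below assumes that "?" stands for any run of zero or more characters, as in the two stems above; the function names and the word list are invented, and the four-character minimum for left truncation is not modeled.

```python
import re


def stem_to_regex(stem):
    """Turn a truncated stem such as '?HEMI?' into a compiled regex.

    Assumption: each '?' matches any run of zero or more characters.
    """
    parts = [re.escape(part) for part in stem.split("?")]
    return re.compile("^" + ".*".join(parts) + "$", re.IGNORECASE)


def matches(stem, words):
    """Return the words retrieved by the truncated stem."""
    pattern = stem_to_regex(stem)
    return [word for word in words if pattern.match(word)]


# A tiny invented word list standing in for a database index.
words = ["chemical", "chemistry", "hemisphere", "alchemist",
         "ion", "ionization", "solution", "nation", "isatin"]

print(matches("?HEMI?", words))  # ['chemical', 'chemistry', 'hemisphere', 'alchemist']
print(matches("?ION?", words))   # ['ion', 'ionization', 'solution', 'nation']
print(matches("HEMI?", words))   # ['hemisphere']
```

Truncating on the right only, as in the last line, keeps the answer set much smaller than truncating on both sides.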

VII. The CA, Registry, and Other CAS-Produced Files on STN: CAS Databases

Chemical Abstracts is the largest and most nearly comprehensive abstracting service for information in chemistry. It covers a very broad range of topics and has been published since 1907. At present Chemical Abstracts Service creates three main files and several related databases. These include the CA File of literature that extends back to 1907 and the CAOLD file that at present covers the period 1907-66. The Registry File contains searchable information that leads to the rapid identification of a compound, when a name, molecular structure, or other pertinent data is known about it. The Registry File also links these substances to the information that is indexed in the CA File and other chemical databases on the STN system through the Registry Numbers assigned by Chemical Abstracts Service to chemical substances. The CAS REGISTRY NUMBER is a unique number assigned to each chemical substance in the Registry File. For isatin, it is 91-56-5.

Also produced by CAS are the CASREACT file of organic reaction data, the CHEMCATS file that links chemical substances with commercial suppliers, the CHEMLIST file of regulatory data, and a special variant of the CA File, CAplus, that offers rapid coverage of the articles in the main journals of chemistry.
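A side note that is not part of the original notes: the last digit of a CAS Registry Number is a check digit, so a mistyped number such as the one quoted above for isatin can often be caught before it is searched. The sketch below uses the commonly documented rule (multiply the digits before the check digit by 1, 2, 3, ... counting from the right, add the products, and take the last digit of the sum). The function name is hypothetical.

```python
def cas_check_digit_ok(registry_number):
    """Return True if a Registry Number such as '91-56-5' has a valid check digit."""
    parts = registry_number.split("-")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        return False
    if len(parts[1]) != 2 or len(parts[2]) != 1:
        return False
    body = parts[0] + parts[1]        # every digit except the check digit
    check = int(parts[2])
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check


print(cas_check_digit_ok("91-56-5"))   # True  (isatin: 6*1 + 5*2 + 1*3 + 9*4 = 55)
print(cas_check_digit_ok("91-56-4"))   # False (wrong check digit)
```

A passing check only means the number is well formed; it does not prove that the number belongs to the substance you had in mind.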

The CA File covers chemical literature found in journals, patents, patent families, technical reports, books, conference proceedings, and dissertations from all areas of chemistry, biochemistry, chemical engineering, and related sciences from 1907 to the present. The CAplus file is a special version of the CA File that even has records for about 600 articles published before 1907. Since October, 1994 it contains all articles from more than 1,500 key chemical journals, including records for document types not covered in Chemical Abstracts (CA): biographical items, book reviews, editorials, errata, letters to the editor, news announcements, product reviews, meeting abstracts, and miscellaneous items. Bibliographic information and abstracts for the articles from the key chemical journals are added within one week of journal receipt. Both the CA and CAplus files were retrospectively converted to include earlier information. By the end of 2002, all CA bibliographic data that appeared in the printed Chemical Abstracts was included in the CA and CAplus files.

There are low-cost learning files that correspond to:

VIII. SciFinder and Other Front-end Software and WWW Access

Learning the command language of STN International, DIALOG, or other vendors can be a significant barrier to online searching for some. There are programs that can help the novice searcher. One such FRONT-END program is STN Express with Discover. Questel/ORBIT's IMAGINATION software is another front-end software package.

The most recent efforts by the major vendors to win online searchers have been directed toward the Internet. For example, STN EASY allows direct access to the STN databases with a relatively straightforward graphical user interface. Most recently, STN has developed for professional searchers STN on the Web. The U.S. National Library of Medicine's PubMed gives free and easy access to a version of the National Library of Medicine's main database, Medline.

Another STN product is SciFinder and its academic counterpart, SciFinder Scholar, which make the searching of some of STN's databases (CAplus, Registry, CHEMLIST, CHEMCATS, and CASREACT) relatively effortless. It lets the user perform chemical searches by clicking on the icons depicted below.

[Figure: SciFinder Scholar 2000 Explore Options]

SciFinder Scholar removes the need to know the STN Messenger search commands. It even makes it unnecessary to know the proper use of Boolean operators in a subject (Research Topic) search or to know how to use truncation symbols. It employs sophisticated built-in intelligence to deduce the relationships the searcher desires among the various words and phrases. Nevertheless, many online search systems, including Internet search engines, require at least a passing knowledge of these techniques in order to use them effectively.

 


 

Part 7: Current Awareness, Reviews, and Document Delivery

I. Introduction

With the tremendous volume of primary chemical literature appearing each year, chemists need ways to become aware of new critical items they should be reading. The service that provides such assistance is called CURRENT AWARENESS. (These services are now often referred to as "alerting services.") Current awareness services automatically search on a frequent basis (usually weekly) the most recent entries in a database according to a search profile (strategy) that has been developed. For a broader look at less recent literature (perhaps one to two years old or older), REVIEW ARTICLES are often sought. These may cover hundreds of articles or other documents on the topic of the review. Finally, once the appropriate primary documents have been found, it may be necessary to use the services of a DOCUMENT DELIVERY SYSTEM, since it is likely that not all of the items needed will be found in the local library.

II. Current Awareness

A. Commercial Options from Major A&I Services

ISI's Current Contents series is a weekly series of current awareness bulletins. These have author and subject indexes, and entries appear in the printed Current Contents shortly after the appearance of a covered primary journal issue. One advantage of these table-of-contents services is that more journals are included in them than are found in most libraries. A disadvantage, however, is that there may be a few weeks' delay between the appearance of the primary journal issue and its entry into the secondary Current Contents issue.

Current Contents comes in the following printed and electronic science editions, with approximately 1000-1600 journal titles covered in each:

It is also possible to subscribe to a Web version of Current Contents, Current Contents Connect. With such a product, an interest profile consisting of subject words, authors' names, etc., can be run against each weekly update. CC is also available as a printed weekly booklet, on diskettes, CD-ROMs, or as FTP files. The CD-ROM versions actually have a rolling one year's worth of data. Output can be sent as a library service via e-mail if profiles are run from a central library location.

CC has the capability to output the references in a format that will feed into personal database software, such as EndNote or ProCite. An added feature is the option of getting conference proceedings that are published as books. (Many conference proceedings are published as regular or special issues of primary journals and, hence, would already be covered in the basic CCOD.)

Chemical Abstracts Service has a Table-of-Contents feature in SciFinder or SciFinder Scholar. Much of the bibliographic information that enters the CAS database is now received from the publishers in electronic format. With electronic versions of primary articles now appearing weeks or even months before the printed counterparts, it is important to be able to list those articles in the database when they become available.

 

B. UnCover REVEAL Service for Journal Article Current Awareness

One can receive e-mail tables of contents of up to 50 journal titles from the ingenta (formerly, CARL UnCover) REVEAL service (not a free service). The ingenta database is not limited to science journals, and this creates a greater likelihood of obtaining output that is not really wanted (known as FALSE DROPS).  The service also allows up to 25 author or topic searches to be run weekly against new articles that have been added to the UnCover database in a given week. Since no additional indexing beyond the title words is added to the entries, they are available in the UnCover database very soon after publication, usually within a week of publication of the primary journal issue. Consequently, UnCover is one of the most up-to-date current awareness services in existence at this time.

 

C. Internet Journal Table-of-Contents Lists

Many publishers and others now put lists of the tables of contents of journals on the Internet. These are usually free to the user.

 

Elsevier's ContentsDirect is one example of a service offered by a major publisher to provide free tables of contents of their journals in advance of publication. With over 1400 journals currently available, there are sure to be some of interest to chemists. Elsevier also provides current awareness services on some hot topics such as the fullerene articles that appear in some of their journals.

 

ACS offers a similar service for its 30+ journals, ASAP Alerts and Table of Contents Alerts. ASAP stands for "As Soon As Publishable," so when one of the articles enters the database, you are notified immediately via an e-mail that includes a link to the article. The ACS Table of Contents Alerts is also an e-mail notification service, but it comes out only when the entire contents of a new issue is posted on the Web.

 

D. CA Selects/CA Selects Plus and Other Standard Interest Profiles

A STANDARD INTEREST PROFILE is a type of current awareness service that covers a topic of sufficiently general interest to make it profitable to spread the cost among a large number of subscribers to the printed product. The CA Selects/CA Selects Plus products are bi-weekly printed updates that contain the same abstracts found in the printed CA. American Chemical Society members who pay with their own funds receive a substantial discount. There are over 200 separate topics for which the CA Selects standard interest profiles are produced. CA Selects Plus on the Web also has all of the topics available and offers many advantages over the printed counterparts, including a hyperlink to the new issue from e-mail notification and key-word searching of the issue.

Other database producers' products of this type can be found in the Index to Standard Interest Profiles in Science and Technology.

 

E. Custom SDI Service

Custom interest profiles (SDI or SELECTIVE DISSEMINATION OF INFORMATION) can be constructed to produce frequent computerized updates from the Chemical Abstracts or other databases. Since SDI is tailored to individual interests, the cost is high compared to other options. A profile can be constructed on most databases on search vendors' systems (for example, STN) with automatic updates sent to an e-mail address, if desired.
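The idea behind an SDI run can be sketched in a few lines of Python: a stored profile is matched against only the records added in the latest update. Everything here is invented for illustration (the Record and Profile classes, the field names, the sample records, and the e-mail address); it is not how STN or any other vendor implements its alerts.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One newly added database entry (hypothetical structure)."""
    title: str
    authors: list


@dataclass
class Profile:
    """A stored interest profile (hypothetical structure)."""
    email: str
    title_words: set   # subject words, lower case
    authors: set       # author names exactly as they are indexed


def run_profile(profile, new_records):
    """Return the new records that match a subject word or an author."""
    hits = []
    for record in new_records:
        words = set(record.title.lower().split())
        if words & profile.title_words or set(record.authors) & profile.authors:
            hits.append(record)
    return hits


# Invented weekly update and profile.
weekly_update = [
    Record("Fullerene derivatives as ligands", ["Smith A"]),
    Record("Polymer rheology at high shear", ["Doe J"]),
    Record("Kinetics of ester hydrolysis", ["Lee K"]),
]
profile = Profile(email="chemist@example.edu",
                  title_words={"fullerene"}, authors={"Doe J"})

hits = run_profile(profile, weekly_update)
for record in hits:
    print(f"To {profile.email}: {record.title}")
```

Because the match here uses title words only, a broad word in the profile would bring in the same kind of false drops described earlier for the table-of-contents services.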

III. Reviews

All of the sources in the previous section are aimed at making you aware of the existence of new primary literature as soon as possible after its publication. Sometimes, particularly at the start of a large research project, it is desirable to take a broader look at a subject, perhaps in one- or two-year periods of time. Review articles (or chapters) are written by experts in a field to make it easy to survey a large body of literature on a topic. The reviewers sift out the best literature, write a brief summary of the significant findings of the works, and give full bibliographic references to the primary works. Thus, in a large field, a single review article may include hundreds of references.

Review articles are sometimes published as special articles in primary journals, sometimes in conference proceedings. Most reviews are published in serial works that look like books, but often have titles that are clues to the review nature of their contents, for example, Annual Review of Biochemistry or Progress in the Chemistry of Organic Natural Products. They may also be published in journals whose purpose is to publish reviews, such as Chemical Reviews, a publication of the American Chemical Society, or Mass Spectrometry Reviews.

On SciFinder Scholar, one of the refine options allows you to select the document type "review". Each article that the author considers a review is so indexed in the CAplus database. Likewise, in the Web of Science, one can limit the output to review articles.

A new concept in providing reviews seeks to cut down the time lag between the appearance of the new primary literature and its inclusion in a review. It is the Faculty of 1000 service. It is a literature awareness tool that highlights and reviews the most interesting papers published in the biological sciences, based on the recommendations of a faculty of well over 1000 selected leading researchers. These scientists provide a consensus map of the important papers and trends across biology and the life sciences.

The Index to Scientific Reviews, produced by the same company that publishes the Science Citation Index and Current Contents, has covered reviews since the early 1970s. A good source of reviews in organic chemistry is the series of treatises published by Pergamon Press. For example, v. 9 of Comprehensive Organometallic Chemistry includes an "Index of Review Articles and Specialist Texts on Organometallic Chemistry."

A new service is BioMedNet Reviews, an archive of more than 5,000 full-text reviews from the series of journals entitled Trends in... and Current Opinion in....

IV. Document Delivery

DOCUMENT DELIVERY is a term used in libraries to refer to the process of acquiring a copy of an item which your home library does not own and does not intend to buy in the original. Thus, it could mean INTERLIBRARY LOAN, the process whereby copies of books are borrowed through your library from other libraries or copies of articles are obtained from them. It more and more refers to the purchase of individual copies of the items to be given to the end user (perhaps at no charge or only a partial charge to the end user).

As electronic journals become widely available, the boundaries between secondary abstracting and indexing services and the primary journals are obscured. For example, in 1998, the STN ChemPort service began to provide access to full-text journals of key scientific publishers through STN Easy, STN Express, STN on the Web, SciFinder, and SciFinder Scholar. There are direct links from search results in CAplus, MEDLINE, EMBASE, BIOSIS, INSPEC, and other secondary scientific databases to the corresponding electronic full texts of primary journal articles and other documents at the publishers' sites. Some of the publishers offer access to single articles on a per-article sales basis.

There are now hyperlinks from the citations in the original articles of some journals to CAplus records. Conversely, CAS has started to add citations to the CAplus records, thus allowing links among the CA records in the database, and from there through ChemPort to the original literature. Since the American Chemical Society has announced plans to put all of the articles in its journals on the Web by the end of 2001, this development provides an attractive link to a major portion of the significant chemical primary literature.

Publishers are always concerned about violations of their copyright on the articles in their journals. Developments in document delivery that libraries have pioneered under the Fair Use clause of the current Copyright Act have always been viewed with mistrust by publishers. The electronic age has served to widen the divide between librarians and publishers, since the latter now seek to license content to libraries rather than to sell it outright.

 


 

Part 8: Background Reading: Dictionaries, Encyclopedias, and Other Books

I. Introduction

Finding a dictionary, an encyclopedia, or even a textbook that covers a particular chemical subject or concept can quickly solve many chemical information questions. If more in-depth information is needed, one could turn to treatises or monographs written on the topic. For this type of background reading, a subject search of a LIBRARY OPAC (online public access catalog) database is often a good approach to identify an appropriate source.

In the Library of Congress subject headings that are commonly used in college and research libraries, the broad area of chemistry is broken down into sub-areas, which invert the names of the sub-disciplines:

Many library OPACs allow two methods of searching for a subject:

The latter approach utilizes a controlled vocabulary, the Library of Congress subject headings. Those may be searched in the IU Libraries OPAC, IUCAT. The broad LC subject headings can often be further defined by topic or format of the material being indexed, so one could search as subject words terms such as:

chemistry inorganic encyclopedias

OR

chemistry analytic dictionaries

to find appropriate works.

II. Dictionaries

Chemical dictionaries vary considerably in the type of information found in them, some being very close in content to a handbook of physical properties. Before we look at specific dictionaries, let's think about why we would go to a science dictionary in the first place. Quick access to essential facts about a topic, with the information arranged in an easy-to-use format--that is the usual reason for consulting a dictionary.

There are some dictionaries that cover many areas of science, for example, the Academic Press Dictionary of Science. This is typical of such dictionaries, with most of the content devoted to definitions of words or concepts arranged in alphabetical order. The work also contains special sections for such information as:

Another example of a general science dictionary is the McGraw-Hill Dictionary of Scientific and Technical Terms. A recent edition of this dictionary included a diskette with a file that can be loaded into Microsoft Word so that the spelling checker in Word recognizes correctly spelled scientific terms instead of flagging them as misspellings.

Below are some specific printed dictionaries for chemistry.

A number of smaller dictionaries that cover all of chemistry have appeared over the years, among them:

These are sometimes repackaged versions of the larger science dictionaries that have appeared from a given publisher.

The recent third edition of the Concise Encyclopedia Biochemistry and Molecular Biology is another translation of a German work. Although there are no color plates, the work has lots of illustrations. The Comprehensive Dictionary of Physical Chemistry, which appeared in 1992, was the first of a planned seven volumes that would cover all of modern chemistry.

Unfortunately, there are not a lot of freely accessible dictionaries on the Internet. An exception is the BioTech Life Science Dictionary, which has more than 8300 terms that deal mostly with biochemistry, biotechnology, botany, cell biology and genetics. The BioTech work also has some terms relating to ecology, pharmacology, toxicology and medicine. Another dictionary-type work is the Chemical Acronyms Database, with over 12,000 terms. There are also specialized printed acronym works, such as Beddoes' The Polymer Lexicon, a list of over 5,000 acronyms and abbreviations used in the rubber and plastics industries, and GABCOM & GABMET, subtitled, acronyms of compounds and methods in chemistry and physics.

As a final example of a specialized dictionary, consider Callaham's Russian-English Dictionary of Science and Technology. A dictionary of this type is particularly useful when faced with the prospect of reading an article in a foreign language. Again, a library's OPAC may help find a suitable dictionary with a subject search such as:

german language dictionaries english

There are many more dictionaries for chemistry than can be discussed here. To see more of them, perform a keyword search on the Chemical Reference Sources Database for: dictionar

III. Encyclopedias

Encyclopedias, like dictionaries, are tools designed to provide first and essential facts on a topic. However, the encyclopedia will make an effort to fit a topic into a general framework and to relate it to other concepts. Furthermore, an encyclopedia article will often provide a bibliography of key references on a topic so that further investigation in more depth is possible. General encyclopedias usually have a wealth of information on scientific topics, and we now see them becoming accessible on the Web. One such example is Britannica Online, the Web counterpart to the venerable Encyclopaedia Britannica. (Also found at the site is Merriam Webster's Collegiate Dictionary.)

An example of a search for information on mass spectrometry that was run on Britannica Online is below.

Encyclopaedia Britannica Search

Remember that, as a secondary work, information in an encyclopedia is invariably dated, even in an encyclopedia that is on the web. Such tools are not meant to keep up with the latest advances in science that have appeared this week. That is the chore of the abstracting and indexing services.

 

There are specialized general science encyclopedias, just as there are specialized science dictionaries. One of the best is the McGraw-Hill Encyclopedia of Science and Technology. A smaller, but very highly regarded encyclopedia is Van Nostrand's Scientific Encyclopedia. Both of these are now available in CD-ROM editions.

Specialized encyclopedias also exist for more specific areas of science, such as the Encyclopedia of Physical Science and Technology. Although this work keeps the traditional alphabetical arrangement of an encyclopedia, it contains a detailed subject index and a relational index, as well as a glossary.

 

In the field of chemistry, the MOST IMPORTANT ENCYCLOPEDIA is the Kirk-Othmer Encyclopedia of Chemical Technology. The massive set, which began to be published in its fourth edition in 1991, is now complete in 25 volumes. About one-half of the articles deal with chemical substances, and in those articles one can find a wealth of information on everything from physical properties to environmental concerns. But there are also many articles on such topics as industrial processes, uses made of chemical substances, pharmaceuticals, dyes, fibers, food, etc. Plainly stated, this is the single most important reference work for all areas of chemistry and all types of questions that you might encounter in your career. Note that this is not an encyclopedia that is updated all at once. As has been true of previous editions, the reader must be aware that the earlier volumes are more outdated than are those that have recently appeared. A good library will maintain all editions of the Kirk-Othmer Encyclopedia because it is frequently cited in the literature. The article on "mass spectrometry" in the fourth edition covers 24 pages and includes 79 references in the bibliography. In 2001, a Web version, Kirk-Othmer Online, became available.

 

Examples of more specialized encyclopedias in chemistry are the Encyclopedia of Inorganic Chemistry, with 260 main articles and over 860 short definitions, etc. in 8 volumes (also on CD-ROM), and the Encyclopedia of Analytical Science in 10 volumes. Following our example of "mass spectrometry" into the latter work, we find six columns of entries in the index volume. The main article on the topic covers nearly 250 pages of the work, divided into sections dealing with "Theory and Instrumentation," "Techniques," and "Applications." On the other hand, the 3-volume Encyclopedia of Spectroscopy and Spectrometry has only an 8-page article on the historical perspective of mass spectrometry, with an additional one-page article on applications of mass spectrometry to food science. Many other references to the topic are found scattered throughout the work if one consults the index. A new approach for this encyclopedia is that the purchaser of the set has the right to access it on the Web for a period of time, with an annual fee imposed after that. Another specialized encyclopedia is the Encyclopedia of Computational Chemistry in 5 volumes.

Thus, we see that, depending on the type of encyclopedia chosen, a great deal of information about a topic can be found very quickly.

IV. Other Books

The final type of materials we will consider in this section is books. It is sometimes difficult to draw the line between a textbook and a MONOGRAPH, a book written on a fairly narrow topic, increasingly by several authors, that is meant to be an authoritative work on the topic. Whereas a textbook clearly has been written to satisfy a teaching function, monographs may be used in upper-level chemistry courses or specialized seminars to cover a number of related topics for which no single textbook exists. This is especially true for new and emerging areas of study.

No such problems of definition exist for a TREATISE. A treatise is always a multi-volume set of books that is intended to be an authoritative exposition of a topic. The treatise is designed to be used by experts in a field, and the authors presume a fair amount of knowledge of the discipline on the part of the readers. This is evident in the way treatises have the material arranged. Invariably, it is arranged according to the authors' or editors' conceptions of how the material should be most logically presented for the benefit of a reader who understands the discipline. Thus, we never find a treatise arranged in alphabetical order of the topics covered. In addition to the arrangement, a clue that a work is a treatise is often found in the title, which may have the word "comprehensive" as part of the main title.

Examples are:

A treatise usually appears over a number of years, and unfortunately for the novice, it is unlikely that an index covering the entire set will be published until the set is complete.

 


How and Where to Search: General

Part 9: Searching by Author or Organization Name and by Known Citations

I. Introduction

Author searching, whether for individuals or corporations, is often easier than subject searching. Once the name is known, it is usually a matter of figuring out how the system you are searching labels the author field or corporate source field in order to limit the search to that index. Web systems will usually have a box labeled "author" which can be filled in.

The order of entry of a name, the punctuation, and on some databases, whether you must enter the name exactly as it is found in the file are some key points to learn before you attempt a search in an online database.   Show how MTU’s library works.

 

In terms of printed works, author indexes are found in even the very old literature of chemistry. It is usual for a publisher to create an author index at the end of a journal volume or publishing year to allow easy access to the articles published in the journal. Some even compile indexes that cover a decade or more of the journal's publication, and those are sure to include an author index. An example is the Royal Society of London's Decennial Index, 1971-1980, which is an index of authors in their Proceedings, Philosophical Transactions, and Biographical Memoirs publications. Chemical Abstracts Service also published collective five- or ten-year author indexes covering all volumes of Chemical Abstracts from its inception in 1907.

 

A bibliography at the end of an encyclopedia article is a good source of the key names in a given subject area if you are beginning research in a new field. Author indexes are found in abstracting and indexing journals, in bibliographies, review serials, and in many other secondary works. In some instances, it may be worthwhile to look for a company as an author. It will depend on the database how (or even if) the corporate name is indexed. For very common personal names, it is sometimes useful to combine a personal author name search with the corporate name of the company for which the author worked at the time of publication.

II. The Science Citation Index, the Web of Science, and Related Products

A particular type of searching that is related to author searching by personal name is CITATION SEARCHING. In this case, the references to a known author's work, as they appear in the bibliographies of new literature, would be used to identify that new literature. In other words, a citation index is created to provide a link between an older cited work which you know is on a topic of interest and newer citing works. The assumption is that the more recent articles would have cited the older work only if they were on the same topic.

For many years, the Institute for Scientific Information (ISI) has published the Science Citation Index (SCI), with coverage in print format (and now also on the Web of Science) back to 1945. Printed SCI in its complete form is a multi-disciplinary index that covers the most important scientific and technical journals in the world (approximately 5,000 titles).

SCI covers the literature that was published from 1945 to the present and indexes it by author in the "Source Index" to SCI. Think of the Source Index as the author index for the literature that was new at the time the index was published. Since SCI includes the most important journals from all areas of science, it should be one of the first sources you turn to for a search for the publications of any scientist.

The really unique thing about SCI is that each volume also includes a "Citation Index" that in effect extends its coverage much farther back than 1945. Thus, even if an article were written in 1923, as long as someone had cited it in one of the journals covered by SCI after 1944, the bibliographic citation from the older article provides a link to the newer citing article(s).
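The idea behind the Citation Index can be sketched in a few lines of Python. This is only an illustration: the article labels and reference lists are invented, not SCI data. Each newer source article lists the works it cites, and inverting those lists gives a lookup from an older cited work to the newer works that cite it.

```python
from collections import defaultdict

# Hypothetical source articles and the older works each one cites.
# None of these are real SCI records.
source_articles = {
    "Smith 1950": ["Jones 1923", "Lee 1931"],
    "Chen 1962": ["Jones 1923"],
    "Patel 1971": ["Lee 1931", "Smith 1950"],
}

def build_citation_index(sources):
    """Invert 'citing -> cited' lists into 'cited -> citing' lists."""
    index = defaultdict(list)
    for citing, cited_list in sources.items():
        for cited in cited_list:
            index[cited].append(citing)
    # Sort each list so the output is stable and easy to read.
    return {cited: sorted(citing) for cited, citing in index.items()}

citation_index = build_citation_index(source_articles)

# A 1923 work is reachable because later source articles cited it.
print(citation_index["Jones 1923"])
```

Note that "Jones 1923" never appears as a source article in this toy data set, yet it can still be used as a search key, which is the point made above about coverage extending back before 1945.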

There is also a subject index to SCI, which will be discussed in a later lecture.

The Cumulative editions of the printed Science Citation Index were published for the following years:

Years      Source Index   Permuterm Subject Index   Citation Index
1945-54         x                                          x
1955-64         x                                          x
1965-69         x                    x                     x
1970-74         x                    x                     x
1975-79         x                    x                     x
1980-84         x                    x                     x
1985-89         x                    x                     x
etc.

All of the information from the printed Science Citation Index is now found in the Web of Science.

 

It is rare in most disciplines of science today to find an article written by a single scientist. Thus, an article may have 3, 5, 10, or even more authors listed on the publication. The record is far in excess of 100 authors on a single article! For obvious reasons, abstracting and indexing journals usually limit the number of authors' names on a given article that will be included in their printed author indexes, and SCI is no exception. The Source Index covers a maximum of nine authors. Such limitations are beginning to disappear in the computer environment. ISI now includes all authors in their Web of Science version of SCI, and Chemical Abstracts Service, which had a limit of ten through 1996, raised the limit to 150 from 1997 onward.

 

The problem with multiple authors on individual articles is that only one can be listed first. Consequently, the SCI Citation Index will use the first author listed as the point of entry into the Citation Index, EVEN if the first author is not the most prominent scientist listed on the paper (the principal author). This is a reasonable approach, since most people who encounter the publication in a bibliography would see it cited exactly as it appears in the journal itself. However, think about the problem this causes when you want to find out how many people have cited ALL of the publications co-authored by a given scientist. If over the course of a career, there had been instances when the scientist was not listed as the first author, it would mean you would have to use each of those individual references as a separate search key to find all articles that had cited the scientist's works. This is a very tedious task in the print SCI and was not much easier to do in the database until recently. Nevertheless, it is a task that is often desirable to perform for purposes of supporting promotion and tenure cases, identifying young researchers in a particular area of research, etc.

 

The Web version of SCI appeared in 1997 (also with coverage back to 1945 for the new source material). It is called the Web of Science. This version of the Science Citation Index includes abstracts for many of the articles, and from 1997, e-mail addresses of the authors. One of the most powerful features of the Web version is the capability to find citations to most of an author's journal publications even if the author was not listed as the first author on the publication. [The articles must have been published in one of the more than 5,700 journals covered by the Web of Science version of Science Citation Index.]

 

Step 1: Enter a Cited Reference search form with the minimal information--author, journal abbreviation, and year--and perform a "Lookup" to see if the work has been cited by anyone.

 

Step 2: Look at the references found by the search, paying special attention to variant forms that are obvious typing errors.

 

Step 3: Choose newer article(s) of interest.

 

Step 4: Look at the full record in Web of Science, including the abstract.

 

Note the box in the upper right-hand corner above that is labeled "Related Records." These are records that have at least one cited reference in common with the document.
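The "at least one cited reference in common" rule can be illustrated with set intersections. The sketch below uses invented record labels and reference identifiers. Ranking the related records by the number of shared references is an assumption made here for illustration, not a documented Web of Science behavior.

```python
# Hypothetical records: each maps to the set of references it cites.
cited_refs = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r4"},
    "C": {"r5"},
    "D": {"r1", "r3", "r6"},
}

def related_records(target, records):
    """Return (record, shared_count) pairs for every record that shares
    at least one cited reference with `target`."""
    target_refs = records[target]
    related = []
    for name, refs in records.items():
        if name == target:
            continue
        shared = len(target_refs & refs)
        if shared >= 1:
            related.append((name, shared))
    # Assumed ordering: most shared references first, then by label.
    return sorted(related, key=lambda pair: (-pair[1], pair[0]))

print(related_records("A", cited_refs))  # [('D', 2), ('B', 1)]
```

Record C shares nothing with A and is therefore not a related record, even if it happens to be on the same topic.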

The online version of Science Citation Index is also available on both DIALOG and STN International where it is called SciSearch.  The Related Records feature is also found on STN's SciSearch, where it can be searched as far back as 1974. There are now CD-ROM versions of the product, even with abstracts, and re-packaging of the database has resulted in subsets such as the Chemistry Citation Index on CD-ROM.

It is now possible to enter a search on STN's SciSearch (or on ISI's Web of Science, as seen in the above example) and do a fairly thorough job of finding all of the publications covered by SCI which have cited a given author's publications. On STN, this is done using the SELECT CIT feature as a bridge from databases where comprehensive author searching is allowed. We could perform an author search for the publications of Ernest R. Davidson in STN's CA file and find everything published by him since 1967 in an answer set L4, for example. The search algorithm for the SmartSELECT feature on STN will extract the relevant search keys from answer set L4 and run the search in SciSearch when the following commands are entered:

=> FILE SCISEARCH
=> S L4 <CIT>

It is not possible to do this sort of comprehensive citation search for a single author with the CD-ROM versions of SCI. A search in the CD-ROM product for cited references is possible, but it will only take place on the author's name that is listed first on the publication itself.

Searches on the Corporate Source Index can be performed in SciSearch. For example on STN, the search statement:

=> S DOW FREEPORT/CS

will yield publications by the researchers at the Freeport location of The Dow Chemical Company.

A corporate search is also possible on the Web of Science, as shown below. A General Search includes an Address option where geographic place names and postal numbers, as well as words from the corporate name can be entered.

III. Author and Corporate Searches in the Printed Chemical Abstracts

It is possible to search the printed Chemical Abstracts (CA) all the way back to 1907, and there are author indexes for the entire period. In fact, searching for authors is made easy by the five- and ten-year cumulative indexes for Chemical Abstracts.

To effectively use the printed author indexes to CA, you must know that the alphabetization of the names takes into account only the first letters of the given names (first and middle names) EVEN THOUGH the full name is listed in the indexes. There are many other rules for determining where names fall in the author index of Chemical Abstracts, and you can refer to the work itself for those.
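The effect of alphabetizing on initials only can be seen in a small Python sketch. It implements just the one rule stated above (surname, then the first letter of each given name) and ignores the many other CA filing rules. The names are invented.

```python
def ca_sort_key(name):
    """Sort key: surname, then only the first letter of each given name."""
    surname, _, given = name.partition(",")
    initials = "".join(part[0] for part in given.split())
    return (surname.strip().lower(), initials.lower())

# Hypothetical author names in "Surname, Given names" form.
names = ["Smith, John Robert", "Smith, James T.", "Smith, Jane A."]

full_name_order = sorted(names)                  # ordinary letter-by-letter order
initials_order = sorted(names, key=ca_sort_key)  # initials-only order

print(full_name_order)
print(initials_order)
```

In ordinary order "James T." comes first, but on initials alone the sequence is J.A., J.R., J.T., so "Smith, John Robert" files ahead of "Smith, James T." even though the full names are printed.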

CA covers not only scientific and technical journals (over twice as many journal titles as Science Citation Index throughout most of the period since World War II) but also dissertations, conference proceedings, patents, technical reports, and other primary literature. In 1995 Chemical Abstracts Service began to include entries for electronic journal articles in CA.

A special type of author entry found in CA is for a PATENTEE, the person who has applied for and received a patent. CA also indexes the PATENT ASSIGNEE, normally the company the patentee works for. Patentees are not found in the Source Index of Science Citation Index since that product covers only primary journals, but patents account for about 1/6 of the documents added to the CA database each year. In the printed CA indexes, the letter "P" is inserted between the volume number and abstract number in the author index to designate that a document is a patent, e.g.,

103:P160286w.
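A reference of this form is easy to take apart programmatically. The following is a minimal sketch that assumes the layout volume, colon, optional B or P prefix, abstract number with an optional trailing letter. The function name and the "other" label for unprefixed entries are hypothetical choices made here.

```python
import re

# volume : optional B/P prefix, then the abstract number (digits plus an
# optional trailing lowercase letter, kept as part of the number).
ABSTRACT_REF = re.compile(
    r"^(?P<volume>\d+):(?P<prefix>[BP]?)(?P<number>\d+[a-z]?)$"
)

# B and P are the prefixes described in these notes; "other" is a placeholder.
PREFIX_MEANING = {"P": "patent", "B": "book", "": "other"}

def parse_ca_reference(text):
    """Split a printed CA index reference into volume, abstract, and type."""
    match = ABSTRACT_REF.match(text.strip().rstrip("."))
    if match is None:
        raise ValueError(f"not a recognizable CA reference: {text!r}")
    return {
        "volume": int(match["volume"]),
        "abstract": match["number"],
        "type": PREFIX_MEANING[match["prefix"]],
    }

print(parse_ca_reference("103:P160286w."))
```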

Corporate bodies are also indexed in the CA author indexes. Bear in mind that companies which include a personal name will have the name inverted in the printed author index, e.g., "Lilly, Eli, and Co."

IV. Author Searching in CAS Databases.

With the debut of SciFinder Scholar 2000, CAS introduced the "Company Name/Organization" search option. This is one of the main search options on the first screen of SciFinder Scholar, but it is also an option to refine a set of answers retrieved in some other manner of searching on the product.

The filing idiosyncrasies of the printed CA are usually not a problem in the STN or other versions of the CA database. With CAS's SciFinder or SciFinder Scholar product, an algorithm finds likely candidates that match the search criteria (e.g., a misspelling of "Hieftje" as "Heiftje"). (However, it probably would not find a typing error such as "Hleftje.")

[Figure: SciFinder Scholar author choice]

A few years ago, Chemical Abstracts Service introduced citation searching into the SciFinder and SciFinder Scholar product line. In 2003, it was possible to find new articles published from 1997 to the present by refining a search using the "Citing References" option. For example, suppose you wanted to know what articles published 1997 or later had cited Dr. Gary M. Hieftje's 1994 publication:

Wu, Min; Madrid, Yolanda; Auxier, Jake A.; Hieftje, Gary M.. New spray chamber for use in flow-injection plasma emission spectrometry. Analytica Chimica Acta (1994), 286(2), 155-67. CODEN: ACACAM ISSN:0003-2670. CAN 120:234885 AN 1994:234885 CAPLUS

When you view the full record for that entry, the "Get Related" option leads to this screen:

[Figure: SciFinder Scholar Citation option]

Choosing the "Citing References" option leads to these newer articles:

[Figure: Records in CAPlus that cite Hieftje's 1994 article]

V. Author and Citation Searching in Other Databases.

The Beilstein database covers the literature of organic chemistry back to the 18th century. Thus, it is a useful adjunct to the Chemical Abstracts and Science Citation Index databases. However, the file was not really designed for author searching, so one must be careful to include names that might be the desired author even if only the last name was entered into the database.

Certain patent databases utilize codes for company names (patent assignee codes). For example, Derwent's World Patent Index assigns a code to about 21,000 companies worldwide that have 50 patents or more. The parent company, subsidiaries, and related companies are retrieved. For Hoffmann-La Roche, the code is 39424.

NLM's PubMed, a version of the Medline database, includes "Related Articles." Although not quite the same as a true citation search, the effect is similar.


Part 10: Searching by Subject


I. Introduction

Almost all abstracting and indexing services, not to mention many other secondary and primary works, have subject indexes. In this part we will look closely at the subject indexes for some of the major works already covered, as well as note the existence of specialized abstracting and indexing services devoted to a particular document type and full-text databases of primary and other literature types. Discussion of the type of subject search that uses the name of a specific chemical compound will be deferred to a later topic, although words that stand for classes of compounds will be discussed here.

 

The searches dealt with here are word searches. We must often find the right word(s) or group of words (phrases) to pull needed information from a given reference tool. Such searches cover techniques, processes, types of reactions, equipment, etc. The searcher has to be aware of variant spellings, the use of initialisms and acronyms, synonyms, and other complicating factors in such a subject search. In addition, the interpretation that the search system gives to the form in which the search statement is input is critical. For example, does the search system interpret two adjacent words as a phrase that must always have the words in that order? Or does it assume that either of those words could be present in a record in order to be a valid hit?
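The difference between the two interpretations can be made concrete with a short Python sketch. The record titles are invented, and the two functions are simplified stand-ins for what a real search system does, not a description of any particular vendor's software.

```python
import re

def tokens(text):
    """Lower-case a string and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_phrase(record, query):
    """True only if the query words occur adjacent and in the same order."""
    rec, qry = tokens(record), tokens(query)
    return any(rec[i:i + len(qry)] == qry
               for i in range(len(rec) - len(qry) + 1))

def matches_any_word(record, query):
    """True if at least one query word occurs anywhere in the record."""
    return bool(set(tokens(record)) & set(tokens(query)))

# Hypothetical record titles.
records = [
    "Mass spectrometry of peptides",
    "Spectrometry of mass-selected ions",
    "Infrared spectroscopy of polymers",
]

query = "mass spectrometry"
phrase_hits = [r for r in records if matches_phrase(r, query)]
any_word_hits = [r for r in records if matches_any_word(r, query)]
print(phrase_hits)
print(any_word_hits)
```

The same two-word input retrieves one record under the phrase interpretation and two under the "either word" interpretation, which is why it matters to know how a given system reads the search statement.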

 

A fundamental question in conducting a subject search is whether all possible words, including synonyms, acronyms, abbreviations, etc. should be used in a subject search or whether the search can be conducted using a set of preferred terms selected by the indexers of the documents. As computers have become more and more powerful, the techniques of FULL-TEXT SEARCHING have become popular, with every word in a document being a potential subject term. Unfortunately, the number of false drops yielded in this type of UNCONTROLLED VOCABULARY search can be quite voluminous. Therefore, searching with terms selected from the CONTROLLED VOCABULARY of a THESAURUS or other subject term authority list is often preferable. An example is the MeSH Medical Subject Heading List that is used with the National Library of Medicine's Medline database. Another NLM effort that is even broader in its scope is the development of a Unified Medical Language System. Included in that project is the UMLS Metathesaurus. Chemical Abstracts Service uses the Index Guide to control search terms in the printed product, and the CA Lexicon on the STN system shows the underlying structure of the CAS vocabulary control system.

The distinction between uncontrolled (keyword) searching and searching using controlled vocabulary is important and is the main point of this lesson. That distinction is blurred in a tool like SciFinder Scholar. Most keyword searches, such as those in Science Citation Index, impose on the searcher the burden of selecting alternate names, acronyms, etc. for the concept of interest when performing the subject search. For example, Electron Spectroscopy for Chemical Analysis (ESCA) and X-Ray Photoelectron Spectroscopy (XPS) are both names for the same technique. Therefore, a search for all references to the technique in a keyword subject index would force the searcher to use both ESCA and XPS in the search strategy.

The relationship of the tool for controlling the vocabulary in Chemical Abstracts (CA Index Guide) to the printed subject indexes is presented below. Truncation symbols in use on various systems are also covered in this lesson, as are the capability to limit online searches in various ways (by date of publication, format, etc.) and to analyze results.

II. Truncation (Masking): The Use of Wild Cards

In many cases where subject searches are concerned, we are looking for topics that involve words built on a common root word, or that have some other variations that are easily signaled to a computer by means of a special symbol. TRUNCATION is the technique that tells the computer to form an answer set consisting of all records that contain words with the characters input for the search, but could also contain related words with suffixes (or, in some cases, prefixes or variable characters at a given point in the word).

On the STN system, truncation symbols are:

Symbol                   Function                   Example
exclamation point (!)    Exactly one character      cataly!e
hash mark (#)            One or no character        alcohol#
question mark (?)        Any number of characters   ?therap?

As noted in the table, the # sign can be used at the end of a word to pick up both singular and plural forms of a word. Another way of accomplishing the same thing on STN using the command language option is to enter SET PLURALS ON at the system prompt. Both left- and right-hand truncations are allowed with the "?".
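One way to get a feel for the three symbols is to translate them into regular expressions. The Python sketch below is a rough emulation, assuming each wildcard stands for letters or digits within a single word. The function names and the word list are made up for illustration, and the real STN matching rules may differ in detail.

```python
import re

def stn_term_to_regex(term):
    """Translate STN-style truncation symbols into a regular expression."""
    parts = []
    for ch in term:
        if ch == "!":
            parts.append(r"\w")    # exactly one character
        elif ch == "#":
            parts.append(r"\w?")   # one or no character
        elif ch == "?":
            parts.append(r"\w*")   # any number of characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matching_words(term, vocabulary):
    """Return the words in `vocabulary` that the truncated term would pick up."""
    pattern = stn_term_to_regex(term)
    return [word for word in vocabulary if pattern.fullmatch(word)]

# Hypothetical index vocabulary.
vocabulary = ["catalyze", "catalyse", "catalyst", "alcohol", "alcohols",
              "alcoholic", "chemotherapy", "therapeutic"]

print(matching_words("cataly!e", vocabulary))  # ['catalyze', 'catalyse']
print(matching_words("alcohol#", vocabulary))  # ['alcohol', 'alcohols']
print(matching_words("?therap?", vocabulary))  # ['chemotherapy', 'therapeutic']
```

Running a truncated term against a sample word list like this is also a quick way to spot unexpected words before they enter an answer set.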

There are limits to the number of terms that can be gathered into a set using truncation. Therefore, caution must be exercised in using truncation to prevent too many search terms (or unexpected words) from entering the answer set.

There is no uniformity of symbols used to designate truncation among different vendors or search engines, although often we find an asterisk (*) used to indicate the right-hand truncation point. That is the case with the Web of Science, for example.

With SciFinder Scholar, no truncation is used. The searcher simply types into the Research Topic search window the natural language expression that defines the search, without even trying to insert Boolean search terms. The SciFinder Scholar search algorithm has some built-in intelligence to look for relevant word forms for the search. For instance, the search system automatically searches for both singular and plural subject words.

Let's see an example of a search on SciFinder Scholar for the analytical technique "Electron Spectroscopy for Chemical Analysis (ESCA)."

[Figure: ESCA search on SciFinder Scholar]

At the time it was run, the search as entered found 3474 references where the two concepts "electron spectroscopy" and "chemical analysis" were closely associated with each other and only 517 where the phrase as entered was found. In this case, let's repeat the search using the acronym for the analytical technique (ESCA) and also use a synonymous acronym, XPS. (The technique is also known as X-Ray Photoelectron Spectroscopy.) We have the option of entering synonymous words in parentheses, following a term or phrase. Thus, entering the research topic search on SciFinder Scholar as:

XPS (ESCA)

would imply to the system that you are looking for synonymous terms (an OR search). This search found considerably more documents: 81,559 at the time of the search.

III. Keyword Searches

Let us restrict the phrase KEYWORD SEARCH to the type of uncontrolled vocabulary searching that is done when the terms are not selected from an authoritative subject list. Keyword indexes are often computer-produced indexes that result in every significant word in the document (or in certain fields of the document) becoming a KEYWORD. Such indexes exist in the weekly printed issues of Chemical Abstracts and in the Science Citation Index in its Permuterm Subject Index. The same is true of the Web of Science subject searches and searches on CARL UnCover. However, Science Citation Index has for a number of years included the capability to enhance the keyword searches using their KeyWords Plus feature.

ISI utilizes words that authors sometimes provide in their articles that they feel best represent the content of the paper. These keywords are contained in the SCI record and are searchable. In addition, ISI generates KeyWords Plus for many articles. KeyWords Plus are words or phrases that frequently appear in the titles of an article's references, but do not necessarily appear in the title of the article itself. KeyWords Plus may be present for articles that have no author keywords, or may include important terms not listed among the title, abstract, or author keywords.
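ISI's actual procedure for generating KeyWords Plus is not described here, so the following Python sketch is only a toy approximation of the idea: collect words that recur in the titles of an article's references but are absent from the article's own title. The stop-word list, the threshold of two titles, and all titles are assumptions made for the example.

```python
from collections import Counter
import re

# Small illustrative stop-word list (an assumption, not ISI's list).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def title_words(title):
    """Return the set of significant words in a title."""
    return {w for w in re.findall(r"[a-z]+", title.lower())
            if w not in STOP_WORDS}

def keywords_plus_sketch(article_title, reference_titles, min_titles=2):
    """Words found in at least `min_titles` reference titles but not in
    the article's own title."""
    own_words = title_words(article_title)
    counts = Counter()
    for ref_title in reference_titles:
        counts.update(title_words(ref_title))  # one count per title
    return sorted(w for w, n in counts.items()
                  if n >= min_titles and w not in own_words)

# Hypothetical article and reference titles.
article = "Surface analysis of copper catalysts"
references = [
    "Photoelectron spectroscopy of copper oxides",
    "X-ray photoelectron spectroscopy of supported catalysts",
    "Adsorption on silica surfaces",
]

extra_terms = keywords_plus_sketch(article, references)
print(extra_terms)  # ['photoelectron', 'spectroscopy']
```

The article's title never mentions photoelectron spectroscopy, yet the references suggest it is relevant, which is the kind of additional access point the feature is meant to provide.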

IV. Controlled Vocabulary Indexes

One of the virtues of a keyword subject index is that the index terms reflect the current, ever-changing vocabulary of science. As soon as a new name for a concept, technique, etc., is used in a document, it could become an indexing term. Controlled vocabulary lists, on the other hand, are slower to adapt to changes in scientific terminology, but their greatest benefit is that they guide you to the preferred term for the concept. Hence, the searcher need only identify the preferred indexing term to find documents of interest.

The printed tool that controls the vocabulary in the Chemical Abstracts six-month volume and five-year collective indexes is the INDEX GUIDE. For example, looking in the "E" section of the Index Guide for ESCA reveals the following:

ESCA (electron spectroscopy for chemical analysis)
    See Photoelectric emission
        x-ray
    See Photoelectron spectroscopy
        x-ray

Likewise, looking in the "X" section of the Index Guide for XPS leads to the same preferred phrases:

XPS (x-ray photoelectron spectroscopy)
    See Photoelectric emission
        x-ray
    See Photoelectron spectroscopy
        x-ray

Thus, the searcher would know that documents on this topic can be found in the "P" section of the General Subject Index to Chemical Abstracts. It is important to use the CA Index Guide before using the General Subject Index because there are no "see" references in the General Subject Index itself. Furthermore, each five-year collective index period has its own Index Guide. CAS has provided a guide to Hierarchies of General Subject Headings to assist in selecting terms.
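The role of the Index Guide can be pictured as a lookup table from entry terms to preferred headings. The Python sketch below encodes only the two cross-references quoted above. Writing the heading and its "x-ray" subheading as one comma-separated string is a choice made here for illustration, not CAS notation.

```python
# Cross-references taken from the Index Guide excerpt quoted in these notes.
PREFERRED = ["Photoelectric emission, x-ray", "Photoelectron spectroscopy, x-ray"]
SEE_REFERENCES = {
    "ESCA": PREFERRED,
    "XPS": PREFERRED,
}

def preferred_headings(term):
    """Return the preferred headings for a term. A term with no entry is
    assumed to be its own preferred heading."""
    return SEE_REFERENCES.get(term.upper(), [term])

print(preferred_headings("esca"))
print(preferred_headings("XPS") == preferred_headings("ESCA"))  # True
print(preferred_headings("Ethers"))  # ['Ethers']
```

Both acronyms resolve to the same pair of headings, so the searcher no longer has to think of every synonym. A real application would also need one table per collective index period, since each period has its own Index Guide.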

V. The Basic Index

In regular database searches, the database vendors will usually define a default subject index in which words are searched. This is known as the BASIC INDEX. In the CA File, the Basic Index contains subject words from the titles, keywords, abstracts, and controlled vocabulary of the documents indexed by CAS.

VI. Chemical Abstracts Printed Subject Indexes and CA File Subject Searches vs. SciFinder Subject Searches

Prior to 1972, there were five- and ten-year Subject Indexes to Chemical Abstracts. Beginning with the 9th Collective Index period for 1972-76, the chemical name index entries for single chemical substances were put into a new work, the CHEMICAL SUBSTANCE INDEX. Everything else, including names for classes of substances (e.g., ethers), went into the GENERAL SUBJECT INDEX. Thus, searches for terms referring to classes of compounds, reactions, processes, equipment, or plant and animal species should be searched in the General Subject Index after the proper term or phrase has been found in the Index Guide. Another way of finding the proper General Subject Index terms for recent CA entries is to utilize the CA Lexicon on STN. The 14th Collective Index period refers to the years 1997-2001. You must keep in mind that the terminology rules may change from one collective index period to another. For example, the 14th CI Period moved significantly toward the current terminology in various fields, preferring DNA to the previous Deoxyribonucleic acids and Drugs to Pharmaceuticals. It is important to check the Index Guide that corresponds to the period you are searching in order to be sure of finding the correct term for use in the General Subject Index.

Not every preferred term or phrase is found in the Index Guide, and if you do not find a listing there, assume that you have chosen the correct preferred term and look in the appropriate section of the General Subject Index. Always be aware that preferred terms may change when the boundaries of the Collective Index periods are crossed.

The keywords used in the weekly issue indexing and single words from the titles, abstracts, controlled vocabulary terms, and so-called text modifications of the controlled vocabulary entries are all included in another CA file index, the BASIC INDEX. The text modifications were sometimes difficult to interpret, so beginning in October 1994, CAS introduced a format that is easier to read.

Old style:

Adsorbed substances

(carbon monoxide and water and nitric oxide, on copper-silica catalysts, reactions of)

New style:

Adsorbed substances

(adsorption and reactions of carbon monoxide and water and nitric oxide on copper-silica catalysts)

As noted above, the SciFinder Scholar topic search will do some behind-the-scenes work to find appropriate terms to include in a search, so people who use that search tool do not have to worry as much about controlled or uncontrolled vocabulary when they perform a research topic search. However, it is recommended that you use synonyms in parentheses next to a related concept, for example, ESCA (XPS).

VII. Section Codes for Online Searches

Since the information in Chemical Abstracts is classified into 80 major subject sections, the section numbers and codes can actually be used on STN with the CA Classification "CC" field in subject searches to assist in limiting a search. For example, works dealing primarily with enzymes are found in section 7 of the weekly Chemical Abstracts. Other documents are assigned to one of the 80 subject categories divided into the following gross categories:

Section Name                                   Section Code   Section Numbers
Biochemistry                                   BIO/CC         1-20
Organic Chemistry                              ORG/CC         21-34
Macromolecular Chemistry                       MAC/CC         35-46
Applied Chemistry & Chemical Engineering       APP/CC         47-64
Physical, Inorganic, & Analytical Chemistry    PIA/CC         65-80

Thus, a strategy that included in an online search on STN:

=> S L4 AND 7/CC

or

=> S L4 AND BIO/CC

would have the effect of limiting the retrieved documents in answer set L4 to those dealing with enzymes (found in section 7 of the printed CA) or, more broadly, to those of a biochemical nature found anywhere in sections 1-20 of the printed product.
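The relationship between section numbers and the five group codes can be written out as a small Python sketch. The answer set below is invented and the function names are hypothetical. The sketch mimics the effect of the two limits, not the way STN carries them out.

```python
# Group codes and section ranges from the table in these notes.
SECTION_GROUPS = [
    ("BIO", range(1, 21)),
    ("ORG", range(21, 35)),
    ("MAC", range(35, 47)),
    ("APP", range(47, 65)),
    ("PIA", range(65, 81)),
]

def section_group(section_number):
    """Return the group code (e.g. 'BIO') for a CA section number."""
    for code, numbers in SECTION_GROUPS:
        if section_number in numbers:
            return code
    raise ValueError(f"CA has no section {section_number}")

def limit_by_cc(records, cc):
    """Keep records whose section matches a number ('7') or a group ('BIO')."""
    if cc.isdigit():
        return [r for r in records if r[1] == int(cc)]
    return [r for r in records if section_group(r[1]) == cc.upper()]

# Hypothetical answer set: (title, CA section number).
answer_set = [
    ("Enzyme kinetics study", 7),
    ("Protein folding simulation", 6),
    ("Polymer blend morphology", 37),
]

print(limit_by_cc(answer_set, "7"))
print(limit_by_cc(answer_set, "BIO"))
```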

VIII. Formats: Document Types

In the printed Chemical Abstracts, a B or P immediately before an abstract number designates a book or a patent respectively. In the online CA file, these and other documents are found in the Document Type (DT) field of the CA File:

Code    Document Type
B       Book
C       Conference proceedings
D       Dissertation
GR      General review
J       Journal article
P       Patent
R       Review
T       Technical report

Thus, combining an answer set number with one or more codes or words can limit the answer set to a particular document type or eliminate an unwanted type, e.g.,

=> S L4 NOT P/DT

or

=> S L4 AND J/DT
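The two statements above amount to simple filters on the answer set. As a rough Python analogy (the record numbers and their document type codes are invented):

```python
# Hypothetical answer set L4: record number -> document type code.
l4 = {1: "J", 2: "P", 3: "J", 4: "B", 5: "P"}

def and_dt(answer_set, code):
    """Analogue of 'S L4 AND code/DT': keep only the given document type."""
    return sorted(rec for rec, dt in answer_set.items() if dt == code)

def not_dt(answer_set, code):
    """Analogue of 'S L4 NOT code/DT': drop the given document type."""
    return sorted(rec for rec, dt in answer_set.items() if dt != code)

print(not_dt(l4, "P"))  # [1, 3, 4]
print(and_dt(l4, "J"))  # [1, 3]
```

The NOT form keeps the book (record 4) while the AND form does not, so the two are not interchangeable even when the aim is simply to get rid of patents.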

Eight new document types (biography, book review, editorial, errata, letter, miscellaneous, news announcement, and product review) were introduced to the CAPlus file in 1994. SciFinder Scholar allows you to refine the answer set by many parameters, among them the document types shown below.

[Figure: SciFinder Scholar Refine by Document Type]

On the Web of Science, General Searches can be limited to:

and several other choices.

SciFinder Scholar searches can be refined by many other options, as seen below.

[Figure: SciFinder Scholar Refine Options]

Similar refinements are possible with Web of Science and other database searches.

IX. Specialized Abstracting and Indexing Services for Subjects or Document Types

There are many specialized abstracting or indexing services that cover either a subset of chemistry, e.g., Analytical Abstracts, or a particular format, e.g., Proquest's Dissertation Abstracts International and their Online Dissertation Services. Many of the techniques for subject searching discussed in this chapter are applicable to those works, but acquainting yourself with the guides, database summary sheets, and other user aids for any tools you choose to search is a very good idea.

X. Full Text Databases

Special techniques, particularly the use of proximity operators, are critical to success in searching text databases. Electronic primary journal databases are now widely available on the Web. American Chemical Society journals can be searched by subject on the Web only by words in the article titles or in the full text of the articles. More sophisticated searching is reserved to the Chemical Abstracts database and a link through CAS's ChemPort service to the articles themselves. The ACS Electronic Supporting Information (formerly called Supplementary Material), containing more detailed data and other supplements not found in the printed journals, is also available to subscribers of ACS journals on the ACS Publications Web site. Links to the Supporting Information can be found in the table of contents for those issues that include such data or linked to the HTML version of the articles themselves.
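A proximity operator requires two terms to occur near each other, not merely somewhere in the same long document. The Python sketch below shows one possible definition (at most a given number of intervening words, in either order). Operator names and exact semantics vary from system to system, so treat the function and its parameter as hypothetical.

```python
import re

def within(text, term_a, term_b, max_gap):
    """True if term_a and term_b occur with at most `max_gap` words
    between them, in either order."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(i != j and abs(i - j) - 1 <= max_gap
               for i in pos_a for j in pos_b)

# Invented sentence standing in for a passage of full text.
sentence = "The spray chamber was redesigned for plasma emission spectrometry"

print(within(sentence, "spray", "plasma", 4))  # True: four words intervene
print(within(sentence, "spray", "plasma", 2))  # False
```

In full text, a plain AND of the two words would match any article that mentions both anywhere, which is the main source of false drops that proximity searching is meant to reduce.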

Elsevier Science makes available on the Web a search engine that covers both Elsevier journals and Web resources. It is Scirus, found at http://www.scirus.com

 


 

Part 11: Chemical Name and Formula Searching (Searching for Information on a Single Chemical Substance)


I. Introduction

Chemical Abstracts Service's Registry File is the single largest collection of data that can be used to identify a chemical substance. Each unique chemical substance is assigned a Registry Number, which CAS uses in preference to a chemical name to index documents in the CA or CAPlus Files. Much of the descriptive information about a compound (its molecular formula, variant names for the substance, as well as much detailed information about its makeup, including the structure) is found in the Registry File. Furthermore, in recent years, actual data (experimental or calculated data) have been added to the file, making it much more like a huge handbook. The Registry Number serves as the unique identifier of the record. The Registry File includes a number of search techniques that are built on the chemical name and other fields included in the Registry File records.

In the printed CA, there is no Registry Number Index. Instead, the Chemical Substance Index links the preferred CA Index Name for the substance to the documents that have information on it. However, names for classes of compounds are indexed in the General Subject Index. Also, in the printed Chemical Abstracts, supplemental access to the printed product is found in the Formula Indexes. The CSI has dictated much of the indexing policy for supplemental terms used to describe the role of the chemical substance in the document. The broad indexing terms found in the CAS Roles in the database and the Standard Subject Divisions in the printed CSI can be of considerable use in retrieving information about a compound on which much has been written.

Molecular formula searching in CA is dependent on the Hill Formula system. The concept of the dot-disconnected formula is important in both the database and the printed Molecular Formula Index to Chemical Abstracts.
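In the Hill system, a formula containing carbon lists C first, H second, and all other elements alphabetically. A formula without carbon lists every element, hydrogen included, in alphabetical order. The Python sketch below applies just that ordering rule to a dictionary of atom counts. The function name is made up for illustration, and dot-disconnected formulas are not handled.

```python
def hill_formula(atom_counts):
    """Build a Hill-order molecular formula from {element symbol: count}."""
    if "C" in atom_counts:
        others = sorted(e for e in atom_counts if e not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in atom_counts else []) + others
    else:
        # No carbon: strictly alphabetical, hydrogen included.
        order = sorted(atom_counts)
    return "".join(
        symbol + (str(atom_counts[symbol]) if atom_counts[symbol] > 1 else "")
        for symbol in order
    )

print(hill_formula({"H": 6, "C": 2, "O": 1}))  # C2H6O   (ethanol)
print(hill_formula({"H": 2, "S": 1, "O": 4}))  # H2O4S   (sulfuric acid)
print(hill_formula({"Na": 1, "Cl": 1}))        # ClNa    (sodium chloride)
```

The last two results look unfamiliar next to the conventional H2SO4 and NaCl, which is exactly why a formula must be rearranged into Hill order before it is looked up in a formula index.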

A search for information on a single chemical substance may start with the name of the substance, its molecular formula, or various other words or codes that can be associated with it. (See: How to Search for CAS Registry Numbers in the CAS Registry File.) In this lesson, we will encounter various coding systems that have been applied to the retrieval of chemical substances from both printed and computer-based sources. The main database to search for such information is the CAS Registry File, which now has in excess of 40,000,000 records for chemical substances. Most of the entries in the Registry File now are for sequences of biological macromolecules. The bulk of the remaining small molecule entries are for organic compounds, either simple organics (esters, steroids, heterocycles, stereoisomers, etc.) or such things as mixtures, polymers, and organic salts. Just over 10% of the file is comprised of inorganic compounds.

II. Substance Searching Using Chemical Abstracts Service Registry Numbers

One very effective method of retrieving chemical substance information from a reference source is to utilize the Chemical Abstracts Service REGISTRY NUMBER for the substance. The Registry Number is a unique number assigned to each substance indexed by CAS. The CAS RN is a number of the format Y-XX-X, where Y can be from two to six digits, and X is one digit, for example, 494-12-2. The Registry Number is found in many databases and increasingly as an index to printed reference works. Bear in mind, however, that the Registry File started in 1965 with new substances that were encountered from that date forward. Most older substances are just now being entered into the system for records that date from 1907-65. CAS is expected to finish this task soon, so there are not many compounds discovered post-1907 that are not in the CAS Registry System. For compounds discovered prior to 1907, it is wise to search the Beilstein and Gmelin databases, which have coverage back to the 18th century.
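The Y-XX-X format lends itself to a quick sanity check before a number is typed into a search system. The sketch below is only an illustration: the function name is hypothetical, and it relies on the check-digit rule published by CAS (the final digit is a checksum of the others), which these notes do not describe. The pattern also accepts a first group of up to seven digits, an assumption made so that newer, longer numbers are not rejected.

```python
import re

# First group: 2-7 digits (the notes describe 2-6; 7 is an assumption
# for newer numbers), then two digits, then one check digit.
_CAS_RN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas_rn(rn):
    """Return True if rn has the Y-XX-X shape and a consistent check digit."""
    match = _CAS_RN.match(rn.strip())
    if not match:
        return False
    body = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit
    # of the body (the check digit itself is excluded).
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


print(is_valid_cas_rn("494-12-2"))  # True
print(is_valid_cas_rn("494-12-3"))  # False
```

A check like this catches most transcription errors, but it cannot tell whether a well-formed number has actually been assigned to a substance.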

The Registry Number appears in the indexing of CA and CAPlus File records in preference to the formal name of the compound.

Registry Numbers in Indexing

The indexing above is part of that given by CAS to the record for:

Grieco, Paul A.; Bahsas, Ali. Reactions of allylstannanes with in situ generated immonium salts in protic solvent: a facile aminomethano destannylation process. J. Org. Chem. (1987), 52(7), 1378-80. CODEN: JOCEAH ISSN:0022-3263. CAN 106:195826 AN 1987:195826 CAPLUS

CAS Registry Numbers are assigned to organic and inorganic substances, metals, alloys, minerals, polymers, coordination compounds, elements, isotopes, peptides, enzymes, biomolecular sequences, and nuclear particles. However, the mere mention of a compound in a document is not enough to ensure that the indexers at Chemical Abstracts Service will tie a CAS RN to the record for that document. To get an entry in the CA indexes, there must be something new reported about the substance. It may be a new method of preparation, a new source for the substance, a new reaction, a new kinetic or mechanistic study, new chemical or physical properties, a new method of analysis, a new use or application, or a new biological effect. Chemical reactants and the resulting products are routinely indexed, but reagents are not indexed unless there is a new preparation of the reagent itself or a novel use of a standard reagent. This must have been the case for some of the compounds in the record above.

III. The Index Guide and Chemical Name Searching in the Printed Chemical Substance Indexes

Just as the Index Guide controls the vocabulary that must be used in the Chemical Abstracts General Subject Index, it also provides the correct name to use in searching the CA Chemical Substance Index. For example, a check of the Index Guide for "Flavan" finds the following:

Flavan
See 2H-1-Benzopyran, 3,4-dihydro-2-phenyl- [494-12-2]

In alphabetizing chemical substance names in the index, locant numbers, stereo designators, etc. are ignored. Thus, we must look in the "B" section of the printed CA Chemical Substance Index for "Benzopyran" in order to find index entries on the compound. Note that the CAS Index Name for Flavan is inverted, with the name of the so-called HEADING PARENT listed first. This keeps structurally related compounds in the same area of the index. The basic Heading Parent compound is listed first, followed by derivatives and other structurally related compounds. The entries in the Chemical Substance Index include the TEXT MODIFICATIONS (other subject words) that give more information about the documents that are indexed.

IV. Qualified Substances in CAS Files and Indexes

If not much has been written about the substance during the indexing period, all of the indexed information is found in a single alphabetical sequence under the Index Name in the printed Chemical Substance Index. However, when the index entries become voluminous, CAS divides them into Standard Subject Divisions. The compounds so treated are referred to as QUALIFIED SUBSTANCES. Originally seven qualifiers were used, but two additional terms (formation and processes) were added in 1994, and one phrase (uses and miscellaneous) was subsequently split apart. The qualifiers are:

V. CAS Roles in the CA and other Files

ROLES are CAS indexing terms assigned to every indexed substance and to controlled index terms for classes of compounds. The use of roles began to be applied to the new online CA File records with v. 121 (July 1994). They were then applied retrospectively to all CA File records by means of a computer algorithm. Since there are 38 specific roles and 7 broad super roles, they substantially expand the indexing terms that were used prior to their introduction. The role terms give a more precise link to the substance. For example, it is now possible to specify not only that you want the preparation of the substance, but also that the preparation be a synthetic preparation, as opposed to industrial manufacture. In the past, there was no distinction made in the use of the term "Preparation" in such cases.

VI. Searching the Registry File with a Chemical Name

The Registry File is the largest single source of chemical names in existence. It can be searched by a trade or common name for a substance (CN), by its CAS Index Name (CN) or by fragments of the CAS Index Name (CNS field).  Just as we had a Basic Index that is formed from subject words in a bibliographic database, there is also a BASIC INDEX for the Registry File when searched on STN. The Basic Index of the Registry File includes both chemical name fragments and molecular formula fragments. It may be necessary to follow certain protocols for special characters in order to search for a chemical name. Greek characters, for example, are spelled out in their entirety with a period before and after the Greek part of the name. An example of such a chemical name search in SciFinder Scholar is below. Note that in the SciFinder Scholar system, the search will work with or without the periods around the "alpha," but in STN command-language searching, the dots are mandatory.

alpha-Methylbenzoin Name Search

Methylbenzoin Registry File Record

VII. Searching the Registry File and Printed CA Indexes with a Molecular Formula

The system which is most commonly used today for arranging molecular formulas in indexes is the HILL SYSTEM. The Hill System covers both organic and inorganic compounds according to the following rules:

  1. Sum individually all like atoms within the molecule.
  2. If carbon is present, place it and the total number of C's first in the formula.
  3. If both carbon and hydrogen are present, place hydrogen and the total number of H's second. Note that if carbon is not present, rule 4 applies to the substance, and the H is placed in its regular position in the alphabet.
  4. All other atoms in the molecule are arranged alphabetically.
    That means that for inorganic substances without carbon, the arrangement is alphabetical.
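The four rules above can be sketched in a few lines of Python. This is a minimal illustration, not CAS software: the function name is hypothetical, and the input is assumed to be a simple list of element symbols, one per atom.

```python
from collections import Counter


def hill_formula(atoms):
    """Build a Hill System formula from a list of element symbols."""
    counts = Counter(atoms)  # rule 1: sum all like atoms
    others = sorted(el for el in counts if el not in ("C", "H"))
    if "C" in counts:
        # rules 2 and 3: carbon first, then hydrogen, then the rest
        order = ["C"] + (["H"] if "H" in counts else []) + others
    else:
        # rule 4 alone: strictly alphabetical, H included
        order = sorted(counts)
    return " ".join(
        f"{el}{counts[el]}" if counts[el] > 1 else el for el in order
    )


print(hill_formula(["H", "H", "S", "O", "O", "O", "O"]))  # H2 O4 S
print(hill_formula(["C", "H", "Cl", "Cl", "Cl"]))          # C H Cl3
```

Spaces are placed between the element symbols only for readability, matching the way the formulas are printed in these notes.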

Within the index itself, the numbers of elements come into play. Here is an example of compounds arranged for a Hill System Index:

 

 

Al6 Ca5 O14
B2 O3
B2 Zr3
Br H
C Cl4
C H Cl3
C H N O
C2 Ca
C2 H4
C2 H4 Br Cl
C2 H5 Al Br2
C5 H8 O2
C8 H5 N O2
C15 H24 N2
C22 H24 F N3 O2
Ca O3 Ti
Cl H
H2 O4 S
H4 Sn
O3 Pb Rb2
O5 P14 Zn7
Sn Zr4

Note that in the Registry File (including the SciFinder Scholar approach), the formulas may be searched with or without spaces between the element symbols. They are put here for clarity. The Hill System gives rise to some formulas that are quite different from those a chemist is used to seeing, e.g., H2O4S for sulfuric acid or BrH for hydrobromic acid.
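One way to see how the element counts "come into play" in the index order is to compare formulas element by element: symbol first, then count. The sketch below is an assumption about the ordering rule, written only to match the example listing above; the function names are hypothetical. The parser accepts formulas with or without spaces.

```python
import re

# An element symbol (one capital, optional lowercase) and an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Turn 'C2 H4 Br Cl' or 'C2H4BrCl' into [('C', 2), ('H', 4), ...]."""
    compact = formula.replace(" ", "")
    return [(sym, int(num or 1)) for sym, num in _TOKEN.findall(compact)]


def hill_index_order(formulas):
    """Sort formulas by symbol, then by count, one element at a time."""
    return sorted(formulas, key=parse_formula)


sample = ["C2 H4", "Cl H", "C Cl4", "Br H", "C H Cl3", "C15 H24 N2", "Ca O3 Ti"]
for formula in hill_index_order(sample):
    print(formula)
```

Because the counts are compared as numbers rather than as text, C15 H24 N2 correctly files after C2 H4, and because whole symbols are compared, every carbon-containing formula files before Ca O3 Ti.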

The printed CA Formula Indexes do not have entries for the 600 or so qualified substances that have lots of information written about them. Thus, we find in the CA Formula Index from the 10th Collective Index period (1977-81):

 

C8H5NO2
1H-Indole-2,3-dione [91-56-5].

See Chemical Substance Index

sodium salt [3486-31-5], 90: 6180p; 91: 157670v; 94: 209034z

This tells us that the printed Chemical Substance Index must be used for detailed information on isatin itself, but it gives direct information that three documents dealt with the sodium salt of isatin during the period. When a substance would have more than 20 entries in a 6-month volume index or more than 50 entries in the 5-year collective Formula Indexes, a "See" reference is made to the name of the substance in the Chemical Substance Index. We find in the Formula Index the abstract numbers for the sodium salt of isatin since there were relatively few documents written about that compound during the 10th Collective Index period.

A chemical formula in the Hill System may have more than one substance with that formula. For a given formula, isomers are arranged alphabetically by the CAS Index Name.

In the online molecular formula index of the Registry File (/MF), salts, addition compounds, and mixtures have the molecular formulas for the components arranged separately, with ratios for salts and addition compounds specified when known. If the ratios are unknown, a lower case "x" before the second formula or subsequent formulas is used, e.g.,

C15 H24 N2 . 2 Cl H

C22 H24 F N3 O2 .x H2 O4 S

These are examples of the so-called DOT-DISCONNECTED FORMULAS.
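To make the notation concrete, here is a minimal sketch that splits a dot-disconnected formula into its components and their ratios. The function name and the returned layout are hypothetical; the only conventions assumed are the ones shown in these notes: a period between components, an optional whole-number or fractional multiplier, and a lower case "x" for an unknown ratio.

```python
import re
from fractions import Fraction

# Optional multiplier ('x', '2', '1/2') followed by the component formula.
_COMPONENT = re.compile(r"(x|\d+(?:/\d+)?)?\s*(.+)")


def split_dot_disconnected(formula):
    """Return a list of (ratio, component); ratio is None when unknown."""
    components = []
    for part in formula.split("."):
        part = part.strip()
        if not part:
            continue  # tolerate a stray period
        ratio_text, body = _COMPONENT.match(part).groups()
        if ratio_text == "x":
            ratio = None            # ratio not known
        elif ratio_text:
            ratio = Fraction(ratio_text)
        else:
            ratio = Fraction(1)     # no multiplier written
        components.append((ratio, " ".join(body.split())))
    return components


print(split_dot_disconnected("C15 H24 N2 . 2 Cl H"))
print(split_dot_disconnected("C22 H24 F N3 O2 .x H2 O4 S"))
```

Each component can then be searched on its own, which is useful because the Registry File also indexes the molecular formulas of the individual components of multi-component substances.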

VIII. Molecular Formulas of Types of Compounds in CA/STN

A. Salts.

Simple salts such as sodium chloride are treated as any other Hill Formula: ClNa.

1. Metal Salts of Complex Organic or Organometallic Acids

In general these substances have the molecular formula of the cation followed by the dot disconnect symbol (the period) and a multiplier times the molecular formula of the anion.

For metal salts of organic acids, the metal replaces one or more hydrogens attached to N, O, P, As, Se, or Te in an organic substance. The CAS structuring conventions treat these substances in the following manner:

The multiplier for the organic acid is always 1. For the metal, it indicates the oxidation state as a fraction, e.g.,
C7 H6 O2 . 1/2 Cu

Example: C6 H8 O7 . 3 Na
1,2,3-Propanetricarboxylic acid, 2-hydroxy-, trisodium salt
CAS RN: 68-04-2

A search of the SciFinder Scholar product for the molecular formula yielded ten answers at the time of the search, among them:

68-04-2 Registry File record

Other examples:

Exceptions:

Organometallic compounds in the Registry File are substances which have a carbon atom directly bonded to a metal atom, e.g., Phenyl Lithium:
C6 H5 Li. Note, however, that carbonium ions and carbanions are generally found as dot-disconnects in the Registry File.

Coordination compounds in the Registry File are substances in which an atom or group of atoms is bound to a central metal atom by a pair of electrons supplied by the coordinate group and not by the central metal atom, e.g., metallocenes. These substances have the Class Identifier code CCS in the Registry File records.

B. Polymers.

Polymers are indicated with the molecular formula of the repeating unit(s) in parentheses, to which an "x" is appended. The "x" indicates a repeating unit. For example, the molecular formula for the homopolymer of 1,3-Butadiene is (C4H6)x. A search for a polymer by molecular formula may retrieve variant forms of the substance, because the syndiotactic, isotactic, graft or co-polymer will all have separate Registry Numbers.

IX. The Basic Index in the Registry File

The Registry File's Basic Index contains chemical name fragments and molecular formula fragments (including molecular formulas for individual components of multi-component substances and single component substances). Formula fragments searched in the Basic Index must be entered without spaces.

X. Element Information

In command-driven searching, it is possible to search for various information about the elements comprising a chemical substance, such as:

XI. Ring System Data and Ring Indexes

The Ring Identifier information (RID) lets you search a database for everything from the number of rings in a substance to the Ring Formula (minus hydrogens). The Registry File now has much information about rings that can be searched online, such as the Elemental Sequence for the Smallest Ring (/ESS), the number of rings in the ring system (/NRRS), etc. These search techniques can be valuable in refining a substance search in the Registry File. See the Registry File Database Summary Sheet for more options.

The Ring Systems Handbook provides an easy way to find the Heading Parent name for ring compounds. This name can then be used in the printed Chemical Substance Index or, for an online search, either the name or the Registry Number can be used to retrieve the Registry File record. It is important to know that the compound found in the Ring Systems Handbook may not actually exist. That is, there may be no information in the CA File on the substance. When a new ring system is identified, the substituents are stripped off, and a new ring system entry placed in the RSH.

The access to the entries in the Ring Systems Handbook is by name or ring analysis (and then by molecular formula of the rings making up the compound, ignoring hydrogens). The main part of the set is arranged by the number of rings comprising the compounds and the individual sizes of the smallest set of smallest rings. Thus, the number of component rings, the sizes of those rings, and the elements comprising them are enough information to find a ring compound. A section in the main body of the work might be labeled:


2 RINGS: 5,6
C4N-C6


We would find in the section an entry for 1H-Indole [120-72-9]

[Structure diagram: 1H-Indole]

with the molecular formula C8H7N and a 2-dimensional structural drawing of the molecule.

It would not be too difficult then to guess the proper Chemical Abstracts Index name for isatin: 1H-Indole-2,3-dione

isatin

Chemical Abstracts includes an Index of Ring Systems with each Formula Index, beginning with the 7th Collective Index period (1962-66).

XII. Compound Class Identifiers.

There are a number of other indexes that can be used in an online search of the Registry File, e.g., Compound Class Identifiers (/CI).

Class Name                        Code
Alloy                             AYS
Coordination Compound             CCS
Registered Concept                CTS
Generic Registration              GRS
Incompletely Defined Substance    IDS
Manually Registered Substance     MAN
Mineral                           MNS
Mixture                           MXS
Polymer                           PMS
Radical Ion                       RIS
Ring Parent                       RPS

An example of the use of the CI field in command-level searching is:

=> SEARCH PMS/CI (retrieves polymers)

Such searches are of use in combination with other Registry File searches in order to narrow an answer set. See the Registry File Summary Sheet for additional possibilities.

XIII. The CAOLD File

The CAOLD File contains records for documents indexed in Chemical Abstracts 1907-66. It is possible to search the CAOLD file with the CAS Registry Number. The records for items in the CAOLD file bear little resemblance to those in the CA file, providing merely a link to the printed Chemical Abstracts accession numbers or a mechanism with which to link to a pdf file of the page. It is important to know that the CAOLD file records were generated from the CA Formula Indexes. Since the qualified substances do not have Formula Index entries, there are many CA accession numbers in the period 1907-66 that do not have pointers from the CAOLD file. It is always best to double-check the results of a CAOLD file search against the printed Collective Index. See the STN Database Summary Sheet for the CAOLD File for additional information.

XIV. Other Online Chemical Dictionary Files

Databases such as the Registry File are referred to as ONLINE CHEMICAL DICTIONARY FILES. They exist to help you identify substances, to gather like substances into a set, and to discover which files on the database vendor's system have information on the substance(s).

Of particular interest are the online chemical dictionary files from the National Library of Medicine. Although not nearly as large as the Registry File, NLM's CHEMLINE file contained over 1,360,000 records as of mid-1995. The CAS Registry Number is part of each record. Searching by CAS RN's, molecular formulas, CAS Index Names, synonyms, various name and structural fragments is possible. A smaller NLM file is ChemIDplus, with nearly 350,000 compounds. An important feature of the ChemIDplus file is SUPERLIST. SUPERLIST designates a collection of lists of chemical substances maintained by key federal and state government regulatory agencies, as well as by scientific organizations concerned with health and environmental hazards of chemical substances. ChemIDplus provides directory assistance to those lists. Searching the NLM files is considerably cheaper than searching the CAS Registry file.

Unlike CAS, the National Library of Medicine has attempted to group compounds with related substances in their index in a hierarchical fashion. From 1963 through 1995, a chemical was generally "treed" in two places: in one Tree showing its chemical structure and in a second Tree under its function, or pharmacological action. The arrangement of chemical headings in MeSH (Medical Subject Headings) has not changed, but NLM no longer puts all drugs under the functional trees.

 


 

Part 12: Structure Searching


I. What is Structure Searching?

STRUCTURE SEARCHING utilizes a graphic depiction of the chemical structure as input for a search. Such searches are generally run against the data in online chemical dictionary files, such as STN's Registry File. Depending on the type of structure search allowed by the system, the complete molecule or any compound containing the structure of the molecule will be retrieved as an answer set. Unlimited substitution of the input molecule may be allowed at free sites on the molecule (a FULL SUBSTRUCTURE SEARCH) or substitution may be limited to certain sites (a CLOSED SUBSTRUCTURE SEARCH). On the STN system, once an answer set is formed in the Registry File, it can be crossed over to the CA or other files to conduct further subject searches of the compounds thus isolated in a structure search. In these cases, it is actually the CAS Registry Number for the compounds that is being searched in the crossover files. Note that it is now possible to conduct a search that takes into account the stereochemistry of the chiral centers and double bonds. Stereo searching can be performed in the Registry File and the Beilstein File on STN or on the Beilstein CrossFire system.

A SIMILARITY SEARCH finds target molecules that are like the query structure in some respects. The likeness might be in some biological property, such as drug absorption or toxicology with respect to metabolism. Usually, it is the similarity in functional groups that is measured. Finally, MARKUSH STRUCTURE SEARCHING, an important technique in patent searches that allows for considerable variability in the structures retrieved, is another option in some files.

II. Why Use Structure Searching?

There are many reasons to do a substructure search, among them:

In combination with other types of searches, structure searching is a very powerful supplement.

III. Structure Searches in the STN Registry and Other Files.

Over 50,000,000 registered chemical substances appear in the Chemical Abstracts Service Registry File. Most of those have been registered since 1965, but, of course, not all of the compounds in the Registry File were discovered since that date. In 2002, Chemical Abstracts Service embarked on a project to retrospectively index all documents in the CA database. Thus, many compounds that have had no new information published about them since the establishment of the CA or CAPlus Files (i.e., since 1967) have now been added to the Registry File.

Most of the millions of compounds in the Registry File have their Registry Numbers linked to the databases on the STN system. The LC (File Locator) field of a Registry File record tells in which databases on STN the Registry Number is found. In addition to the Registry File, structure searches can be conducted in such databases on STN as BEILSTEIN, CASREACT, and others. A similar file locator function is included in other chemical dictionary files, such as NLM's ChemIDplus.

There are several types of structure searches possible in the Registry File, as well as different options for views of the molecules and different methods of inputting the structure. SciFinder Scholar masks to a certain extent the relationship between the Registry File and the CAPlus File, CASREACT, and other databases intertwined with its software.

SciFinder Scholar Structure Window

Within the SciFinder Scholar search stage itself, considerable information can be gleaned about the answer set to be retrieved. In the Preview option, the projected answer set can be analyzed by atom attachments, or, if the drawn structure contains them, by system-defined or user-defined variable groups. Once the structure is built and the answer set is retrieved, the search proceeds as it would if the compounds had been identified by name or molecular formula searches.

The structure search can be further refined with additional structural features or by limiting it to commercially available substances. Once refined, the references can be retrieved that have the Registry Number of the compounds in their indexing.

With a suitable viewer, the image of the molecule can be viewed as a 3-D model.

Isatin viewed with WebLab Viewer Lite

In traditional, command-driven structure searching, when logging on to STN, the choice of terminal determines what type of view of the molecule you will see. If one selects option 3 at the prompt:

TERMINAL (Enter 1, 2, 3 OR ?)

the structural depictions will be encoded with regular punctuation symbols found on a computer keyboard. Thus a double bond might be indicated by a colon (:) or an equal sign (=). With the proper telecommunications software, selecting option 2 will depict the structures as true graphical representations. That is the default option when using STN Express with Discover! (front-end software that allows the building of the structures offline).

The following types of structure searches are possible on STN:

With SciFinder Scholar, one of two options is available, depending on whether the Substructure Search Module is included in the version of the software. The basic SciFinder Scholar search covers an exact and family search. The SSS module allows the fuller search options.

There are actually several stages of a Registry File structure search. The first stage involves a screening of the huge file for compounds that have the requisite substituents and other features, without regard to their position on the molecule. The much more computer-intensive iteration stage involves an atom-by-atom, bond-by-bond look at the candidate molecules isolated in the screen search. Since this stage requires so much of STN's computer resources, there are limits on the number of compounds that can be looked at during the iterative stage. A sample search must be run on approximately 5% of the file, after which a prediction as to whether the full file search will run to completion is given. Assuming the prediction is favorable, the full file can then be searched against the structure. Otherwise, the structure must be modified to be able to run to completion. With SciFinder Scholar, there is some built-in intelligence that offers to "autofix" a molecule that might give the system trouble. It is also wise to preview the SciFinder Scholar search to see what kinds of substances will be retrieved with the structure as drawn.
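The division of labor between the two stages can be illustrated with a toy example. This is not the algorithm STN uses; it is a minimal sketch under strong simplifying assumptions: molecules are hypothetical dictionaries of heavy atoms and bonds, hydrogens and bond orders are ignored, the screen checks only element counts, and the iteration is a plain backtracking substructure match.

```python
from collections import Counter


def passes_screen(query, candidate):
    """Stage 1: cheap test that ignores where the atoms sit."""
    need, have = Counter(query["atoms"]), Counter(candidate["atoms"])
    return all(have[el] >= n for el, n in need.items())


def neighbors(mol):
    """Adjacency sets built from the bond list."""
    adj = {i: set() for i in range(len(mol["atoms"]))}
    for a, b in mol["bonds"]:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def matches_atom_by_atom(query, candidate):
    """Stage 2: try to map every query atom onto a candidate atom."""
    q_adj, c_adj = neighbors(query), neighbors(candidate)
    q_atoms, c_atoms = query["atoms"], candidate["atoms"]

    def extend(mapping):
        if len(mapping) == len(q_atoms):
            return True
        q = len(mapping)  # next query atom to place
        for c in range(len(c_atoms)):
            if c in mapping.values() or c_atoms[c] != q_atoms[q]:
                continue
            # every already-placed neighbor of q must be bonded to c
            if all(mapping[n] in c_adj[c] for n in q_adj[q] if n in mapping):
                mapping[q] = c
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})


query = {"atoms": ["C", "C", "O"], "bonds": [(0, 1), (1, 2)]}        # C-C-O
ether = {"atoms": ["C", "O", "C"], "bonds": [(0, 1), (1, 2)]}        # C-O-C
propanol = {"atoms": ["C", "C", "C", "O"], "bonds": [(0, 1), (1, 2), (2, 3)]}

for name, mol in [("ether", ether), ("propanol", propanol)]:
    print(name, passes_screen(query, mol), matches_atom_by_atom(query, mol))
```

The ether passes the screen (it has two carbons and an oxygen) but fails the atom-by-atom comparison, which is exactly the kind of false candidate the expensive second stage exists to remove.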

IV. Structure Searching on CrossFire

It is also possible to do very precise structure searching on the Beilstein CrossFire system.

CrossFire Structure Module

Unlike the STN system, where the type of structure search (exact, family, closed or full substructure; exact or substructure on SciFinder Scholar) determines the type of compounds retrieved, CrossFire requires the user to "set free sites" by indicating the number of substitutions allowed at given atoms or to make other choices at the time of structure drawing in order to broaden or narrow the scope of a search. Setting free sites is done either all at once in the Query Options menu (once the desired atoms have been selected) or atom by atom by choosing the precise number of free sites allowed for each atom. Other options are the inclusion/exclusion of isotopes and allowing substances that have a charge, radicals, etc. to be retrieved in the search.

As with SciFinder Scholar and other STN options for structure searching, CrossFire includes a number of template files to assist in building complex molecules. To be sure that you are properly drawing a functional group, choose it from the template file "residue.bsd". The template icon is just to the left of the Benzene ring when in structure-drawing mode. Once in a given template, you can use the File-Open option to see the other available templates.

CrossFire also allows predefined groups of variable atoms:

A = any atom
Q = any atom but C or H
M = a metal atom
X = halogen.

The addition of an H to these symbols means Hydrogen could also be one of the variable atoms. For example, XH implies that any of the atoms F, Cl, Br, I, or At plus H would satisfy the search. Likewise, there are generic group symbols to represent such things as carbocyclic or heterocyclic rings, alkyl, alkenyl, or alkynyl chain groups, etc. Finally, the user may define generic groups if the predefined groups are not sufficient.
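The behavior of the variable-atom symbols can be written out as a small lookup. This is a sketch of the rules as stated above, not CrossFire code: the function name is hypothetical, A is assumed to exclude hydrogen unless written AH (which is what the H-suffix rule implies), and M is left out because it would require a list of metals that these notes do not give.

```python
HALOGENS = {"F", "Cl", "Br", "I", "At"}
SUPPORTED = {"A", "Q", "X"}  # M (metal) omitted: no metal list in the notes


def generic_atom_matches(symbol, element):
    """Does a generic symbol such as 'X' or 'XH' allow this element?"""
    allow_h = len(symbol) > 1 and symbol.endswith("H")
    base = symbol[:-1] if allow_h else symbol
    if base not in SUPPORTED:
        raise ValueError(f"unsupported generic atom symbol: {symbol}")
    if element == "H":
        return allow_h              # hydrogen only with the H suffix
    if base == "A":
        return True                 # any atom
    if base == "Q":
        return element != "C"       # any atom but C (H handled above)
    return element in HALOGENS      # base == "X"


print(generic_atom_matches("XH", "H"))   # True
print(generic_atom_matches("X", "H"))    # False
print(generic_atom_matches("Q", "N"))    # True
```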

V. Beilstein and Gmelin

Beilstein is for organic compounds, whereas Gmelin is for inorganic and organometallic compounds.

Beilstein covers compounds containing carbon along with the following elements:

          H
          Li, Be              B,  C,  N,  O,  F
          Na, Mg                  Si, P,  S,  Cl
          K,  Ca                      As, Se, Br
          Rb, Sr                          Te, I
          Cs, Ba

Compounds can be single components or salts and mixtures (if they have at least one organic component). Peptides are covered if they contain twelve or fewer amino acids. Polymers or polycondensation products are not treated. The following are not typically treated as Beilstein compounds:

Gmelin covers compounds not covered in Beilstein, i.e., inorganic and organometallic chemistry as well as related fields such as mineralogy and metallurgy. Compounds are indexed with terms such as coordination compounds, alloys, ceramics, and inorganic polymers.

VI. Beilstein Lawson Numbers

Compounds in the Beilstein database are also indexed by a number that indicates various structural features. That is the Lawson Number. It represents certain structural fragments and can be used for structural similarity searches. In general, the smaller the Lawson Number, the more common the fragment. Every substance in Beilstein has at least one Lawson number assigned to it. Dividing the Lawson Number by 8 puts you roughly in the Beilstein system number for the printed Beilstein volume that contains the compound. The compounds are divided into 3 major groups in the printed Beilstein Handbook:

  1. Acyclic Compounds, Volumes 1-4; System Numbers 1-449
  2. Isocyclic Compounds, Volumes 5-16; System Numbers 450-2358
  3. Heterocyclic Compounds, Volumes 17-27; System Numbers 2359-4720.

[Unfortunately, the Beilstein Institute never published the meanings of the 4,720 system numbers used to classify organic compounds.]
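The divide-by-8 rule of thumb and the three system-number ranges above combine into a quick estimate of where a compound sits in the printed Handbook. The sketch below only restates that arithmetic; the function names are hypothetical, and because the rule is approximate the result should be treated as a starting point for browsing, not an exact location.

```python
# (first system number, last system number, division, volumes)
DIVISIONS = [
    (1, 449, "Acyclic Compounds", "Volumes 1-4"),
    (450, 2358, "Isocyclic Compounds", "Volumes 5-16"),
    (2359, 4720, "Heterocyclic Compounds", "Volumes 17-27"),
]


def approximate_system_number(lawson_number):
    """Rough Beilstein system number: the Lawson Number divided by 8."""
    return lawson_number // 8


def beilstein_division(lawson_number):
    """Return (division, volumes) for a Lawson Number, or None if out of range."""
    system = approximate_system_number(lawson_number)
    for low, high, name, volumes in DIVISIONS:
        if low <= system <= high:
            return name, volumes
    return None


print(approximate_system_number(8000))  # 1000
print(beilstein_division(8000))         # ('Isocyclic Compounds', 'Volumes 5-16')
```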

The Lawson Number is effective when used in combination with other search keys, such as molecular formula, element ranges, etc. It is also useful when combined with NOT in substructure searches.


How and Where to Search: Specialized

Part 13: Chemical Patent Searching


I. Introduction

There is a large body of chemical information found in patents. In fact, the very first US patent was issued to Samuel Hopkins on July 31, 1790 for "an improvement, not known or used before such discovery, in the making of Pot ash and Pearl ash by a new apparatus and process." Now, the number of US patents is well over 6,000,000.

Patents are covered by Chemical Abstracts, and more than 15 percent of the document citations in CA are for patents. Most patents of interest to chemists cover compositions of matter (new chemical compounds, mixtures, pharmaceuticals) or processes (e.g., synthesis of a drug). Under US law, it is even possible to patent things such as 3-D atomic structures, structural databases, and their uses, which may result from the genomics field. Other types of patents are issued for machines, manufactures, plants, and designs. All types of US patents now receive protection for 20 years, with the exception of design patents (original ornamental designs), which get 14 years. With hundreds of thousands of patents issued annually by various countries, a great deal of effort is necessary to organize patent documents for effective retrieval.

Some abstracting and indexing services ignore the patent literature altogether, whereas others, such as the Beilstein Institute, have covered patents only at certain times in their history. [Beilstein's patent coverage ended around 1980; prior to 1960, Beilstein is a very good source of patent information.] A few commercial organizations, notably Derwent (worldwide) and IFI CLAIMS (US), specialize in patent coverage in some or all fields. Patents are part of a group of materials generally referred to as INTELLECTUAL PROPERTY that also includes copyright and trademarks.

II. What is a Patent?

A PATENT is a grant of property rights to an inventor by a government. The inventor gets the exclusive right to use or manufacture an invention for monetary gain or to exclude others from making, using, or selling the invention in the country (or countries) whose government(s) issued the patent(s). To obtain a patent, the inventor must file certain documents, and the invention itself must exhibit the qualities of novelty, utility (usefulness), and invention (unobviousness).

 

Patent searching is often undertaken in order to prove novelty, a process known as PRIOR ART SEARCHING. In this case, the older patent literature is quite important. A second type of patent search involves INFRINGEMENT. In this case, the search must be exhaustive, but is limited to the last 20 years or so.

An invention must also be useful. For a chemical, that might mean that it shows a beneficial property, such as a pharmacological effect, or it might be an intermediate that is used in synthesizing a product that has an end use.

 

Patents for chemical inventions must also meet the criterion of invention (unobviousness). This means that the invention cannot be obvious to someone who is "skilled in the art."

Patents will not be granted for an invention that has already been publicly revealed. Even a posting on the Web can negate the criterion of novelty. In the US, an inventor could not file for patent protection if an article were published about the invention more than one year before the filing is made. Much of the information in the patent literature is, in fact, never published in any other format. However, some chemists denigrate patents as information sources since the titles, descriptions, and claims tend to use general, broad terminology, rather than the precise wording typically found in journal articles or other forms of primary scientific literature.

 

The inventor is the PATENTEE, and the inventor may assign the patent protection rights to another person or company, the ASSIGNEE. The inventor submits the first patent application on a certain date known as the PRIORITY APPLICATION DATE. Under the "Paris Convention for the Protection of Industrial Property of 1883," the priority date is also considered to be the date a patent application is filed in any country that has signed the Paris Convention (as long as the inventor files the application in the other country within 12 months of the priority date). This results in a PATENT FAMILY of publications related to the invention. Some of these may be patents (perhaps in a language that is easier to read than that of the original), whereas others merely document the invention disclosed by the applicant as of the priority date. Regional patenting bodies, such as the European Patent Office, issue patents for groups of countries, and the Patent Cooperation Treaty permits a single filing to initiate the patenting process in a number of countries.

 

The PCT now provides for filing in over 100 countries. The importance of the priority date becomes very obvious when cases arise where two companies file for the same invention at about the same time. Such was the case when ICI Ltd. filed EP 399731 with a priority date of 23 May 1989, and Merck & Co. filed EP 400974 on 30 May 1989.

Patent searching is complex, and a basic understanding of the patenting process is necessary in order to comprehend the different types of documents involved. Most countries publish patent applications 18 months after filing, and those generally start with the letter "A". In addition, a second patent document may be issued after the patent is granted, designated with the letter "B" in general. WO patent numbers are assigned to applications filed through the Patent Cooperation Treaty.
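When working with lists of patent numbers, it can help to split each one into its issuing body, serial number, and kind code. The sketch below is an illustration under stated assumptions, not an official parser: the pattern, the class, and the function names are hypothetical, and the meaning of a kind code varies by office and era (the sample US record later in these notes, for instance, carries the code "A" on a patent issued in 1993).

```python
import re
from typing import NamedTuple, Optional


class PatentNumber(NamedTuple):
    office: str          # issuing body, e.g. "US", "EP", "WO"
    serial: str          # the number itself, separators removed
    kind: Optional[str]  # kind code such as "A", "A1", "B1"; None if absent


# Two-letter office code, digits (spaces or commas allowed), optional kind code.
_PATTERN = re.compile(r"^\s*([A-Z]{2})\s*([\d,\s]*\d)\s*([A-Z]\d?)?\s*$")


def parse_patent_number(text: str) -> PatentNumber:
    """Split a string such as 'EP 569021 A1' into its three parts."""
    match = _PATTERN.match(text.upper())
    if match is None:
        raise ValueError(f"unrecognized patent number: {text!r}")
    office, serial, kind = match.groups()
    return PatentNumber(office, re.sub(r"[,\s]", "", serial), kind)


def describe(pn: PatentNumber) -> str:
    """Rough reading of the kind code, following the general rule in the text."""
    if pn.office == "WO":
        return "application filed through the Patent Cooperation Treaty"
    if pn.kind and pn.kind.startswith("A"):
        return "typically a published application"
    if pn.kind and pn.kind.startswith("B"):
        return "typically a granted patent"
    return "document type not determined"
```

Calling parse_patent_number("EP 569021 A1") gives office "EP", serial "569021", and kind "A1", which describe() reads as a published application.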

III. What is a Patent Specification?

When the inventor applies for a patent, the PATENT SPECIFICATION must be submitted. It is a technical document that contains a description of the invention. A typical patent document will include drawings, background of the invention, a summary and a detailed description of the invention, examples, and one or more CLAIMS that define what is legally covered by the invention.

Most countries allow the inventor to define the legal limits of the patent in the claims in both generic terms and specific terms. For chemical patents, generic inventions usually take the form of a MARKUSH STRUCTURE, a structure that contains one or more structural variables based on a list of stated alternatives. Each compound that could be constructed from the list is covered by the claims.
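To see why a Markush structure can cover so many compounds, consider enumerating every combination of the stated alternatives. This is a minimal sketch with a hypothetical function name; the variables and alternatives are taken from the sample abstract in Section VII of these notes, where "halo," "alkyl," and "haloalkyl" are themselves generic classes, so the true number of covered compounds is far larger.

```python
from itertools import product


def enumerate_markush(variables: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return every assignment of one alternative to each structural variable."""
    names = list(variables)
    return [
        dict(zip(names, choice))
        for choice in product(*(variables[name] for name in names))
    ]


# Alternatives as listed in the sample abstract (generic classes, not atoms).
phenoxyquinoline_variables = {
    "R1": ["H", "halo", "alkyl", "haloalkyl"],
    "R2": ["H", "halo", "alkyl", "haloalkyl"],
    "R3": ["H", "halo"],
    "R4": ["H", "halo"],
}

combinations = enumerate_markush(phenoxyquinoline_variables)
```

With these four variables the sketch yields 4 x 4 x 2 x 2 = 64 combinations at the level of generic classes alone.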

IV. How Long Does a Patent Last?

In the past, the term of protection afforded by patents varied a great deal from country to country. The members of the World Trade Organization (formerly GATT) have now agreed to recognize patents in all fields of technology for a 20-year period that begins with the filing date of the application. The new term took effect in the US on June 8, 1995. Prior to that date, US patents were issued for a term of 17 years from the date of grant.

V. Abstracting and Indexing of Patents: Major Patent Databases

Since patents for a particular invention may appear at different times in a number of countries, abstracting and indexing services generally have adopted the practice of abstracting only the first patent issued, called the BASIC PATENT. Later patents in the patent family are indexed as EQUIVALENT PATENTS. Database producers are not always in agreement when it comes to a definition of basic and equivalent patents. On STN members of a patent family have a common priority application number and date. Further complicating the patent family situation are the patent types that may result in related patents. These may be considered a:

The U.S. Patent and Trademark Office offers free databases for searching patent information published from January 1, 1976 onward. Searches can be performed only by text terms (or numeric identifiers for the patents themselves). Search collections include patents (and published patent applications), expired patents, and trademarks.

USPTO provides access to US patents classified according to the US Patent Classification code. Another system is the International Patent Classification code issued by the World Intellectual Property Organization.

INPADOC, the International Patent Documentation Center, covers about 60 country patent offices as well as international patent bodies, such as WIPO and EPO (the European Patent Office). Therefore, the INPADOC database is a major source of patent family data and forms the basis for the Chemical Abstracts Service's printed patent indexes, and now the data found in the CA and CAPlus files.

IFI CLAIMS files cover U.S. chemical patents from 1950.

Derwent is one of the world's largest patent services, covering over 30 countries plus the European Patent Office and Patent Cooperation Treaty countries. The WPI (World Patents Index) database covers pharmaceuticals from 1963 and other categories of chemicals for the periods shown below:

Derwent also has the Derwent Patents Citation Index covering US Examiner's citations from 1984 and EPO and PCT Examiner citations from 1978.

See STN's How to Search WPINDEX.

Questel-Orbit offers Markush structure searching in the WPIM and PHARMSEARCH files. PHARMSEARCH covers US and European pharmaceutical patents from 1984 to the present, and WPIM has all of the Derwent chemical and pharmaceutical patents. The Merged Markush Service (MMS) is a structure file produced by INPI (the French Patent and Trademark Office) and Derwent Information Limited that can be searched by generic structure input. MMS includes all of the structures previously available in the MPHARM (Markush PHARMSEARCH) and DWPIM (Derwent World Patents Index Markush) files. MMS also includes all the compounds from the Derwent Chemical Resource (DCR). A new database on Questel-Orbit is PlusPat, a source of worldwide patent data that extends back into the 19th century. PlusPat merges the European Patent Office (EPO) DOCDB file (which is the basis for EPO databases, including Esp@cenet) with information from additional Questel-Orbit databases.

VI. CAS Patent Databases on STN

Chemical Abstracts Service's coverage of patents began with the print CA in 1907. Enhanced coverage of patents was initiated around 1960. In 1999, CAS began to include patent family information in its CAplus file. There are two types of patent family relationships in CAplus.

Patent coverage is extremely fast in CAplus now, with most chemical patents making it into the database within two days of issue.

STN now has a Markush database in the MARPAT file (1988- ). MARPAT provides access by structure to all the specific and generic substances claimed in patents via Markush structures from 1988 to the present.

 

USPATFULL has the full text of all patents issued by the USPTO from 1975 (with partial coverage from 1971-74). The US Patent Office began to publish US patent applications on March 15, 2001, so the USPATFULL database now includes those documents. They are distinguished from the granted patents by their kind code, A1. CAS indexes chemical patents in USPATFULL, including CAS Registry Numbers.

VII. Examples of Chemical Patents

Note the increase in the US patent numbers issued during the years covered by the table below.

Issuing Body | Patent Number | Year | Title
USPO         | 2,167,351     | 1939 | Piperidine compounds and a process of preparing them
USPO         | 2,248,018     | 1941 | 4-Aryl-piperidine-ketones and a process of preparing them
USPO         | 2,479,295     | 1949 | Process and culture media for producing new penicillins
USPO         | 2,840,577     | 1958 | 17-Thio derivatives of Estratrien-3-ol and of Estratetraen-3-ol
USPO         | 3,903,094     | 1975 | Piperidyl glycolates
USPO         | 4,150,158     | 1979 | Oxadiazindione derivatives useful as insecticides
USPO         | 4,284,794     | 1981 | Prostaglandin derivatives
EPO          | 0 274 909 A2  | 1987 | Hydrocarbon oxidations catalyzed by azide- or nitride-activated metal coordination complexes
USPO         | 4,900,871     | 1990 | Hydrocarbon oxidations catalyzed by iron coordination complexes containing a halogenated ligand
USPO         | 5,245,036     | 1993 | Process for the preparation of 4-phenoxyquinoline compounds

Sample CAPlus Entry for A Chemical Patent (Partial Record) 
Process for the preparation of 4-phenoxyquinoline compounds useful as fungicides.

Robery, Roger L.; Alt, Charles A.; DeAminis, Carl V. (Dow Elanco, USA). U.S. (1993), 4 pp.

CODEN: USXXAM US 5245036 A 19930914 Patent written in English. Application: US 92-879488 19920507. CAN 120:30682 AN 1994:30682 CAPLUS
Patent Family Information 

Patent No.  | Kind | Date     | Application No. | Date
US 5245036  | A    | 19930914 | US 1992-879488  | 19920507
JP 06041083 | A2   | 19940215 | JP 1993-124804  | 19930430
EP 569021   | A1   | 19931110 | EP 1993-107392  | 19930506
    R: CH, DE, DK, ES, FR, GB, IT, LI, NL

Priority Application Information

Application No. | Date
US 1992-879488  | 19920507

Abstract 

4-Phenoxyquinolines I [R1, R2 = H, halo, alkyl, haloalkyl; R3, R4 = H, halo], which are known plant fungicides (no data), are prepd. by etherification of corresponding 4-haloquinolines with phenols HOC6H3R1R2 in the presence of a catalytic amt. of a 4-dialkylaminopyridine. The catalyst accelerates an otherwise sluggish reaction, thereby shortening batch cycle times and effectively enlarging manufg. capacity. Thus, reaction of 0.01 mol 4,7-dichloroquinoline with 0.014 mol 2-CF3C6H4OH in refluxing xylene contg. 0.0015 mol 4-dimethylaminopyridine showed complete conversion after 24 h. Isolation of the product via formation, filtration, and neutralization of the HCl salt, gave 89% I (R1 = 2-CF3, R2 = R4 = H, R3 = 7-Cl). Two addnl. I were similarly prepd. in 86-93% yield and reaction times of 16-18.5 h. Also used as catalysts were 4-pyrrolidinopyridine and polyDMAP (polymer-bound 4-aminopyridine from Reilly Industries, Inc.).
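The family information in the sample record above shows how database producers can tie related publications together through a shared priority application. The following is a rough sketch of that grouping, with hypothetical function names. It assumes, as the record indicates, that all three publications claim the priority application US 1992-879488, and it treats the earliest-published member as the "basic" patent, which is only one of several definitions in use.

```python
from collections import defaultdict

# (patent number, kind code, publication date, priority application number),
# copied from the sample record; the shared priority is the record's own claim.
records = [
    ("JP 06041083", "A2", "19940215", "US 1992-879488"),
    ("EP 569021", "A1", "19931110", "US 1992-879488"),
    ("US 5245036", "A", "19930914", "US 1992-879488"),
]


def group_into_families(rows):
    """Group publications by priority application, earliest publication first."""
    families = defaultdict(list)
    for number, kind, published, priority in rows:
        families[priority].append((published, number, kind))
    # YYYYMMDD strings sort chronologically, so a plain sort is enough.
    return {priority: sorted(members) for priority, members in families.items()}


def basic_and_equivalents(members):
    """Split a sorted family into its first member and the remaining ones."""
    basic, *equivalents = members
    return basic, equivalents


families = group_into_families(records)
basic, equivalents = basic_and_equivalents(families["US 1992-879488"])
```

In this sketch the US patent comes out as the basic patent, with the European and Japanese publications as equivalents.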

VIII. Miscellaneous

Derwent Fragment Codes offer a way to conduct chemical searches on the Derwent files, such as WPI. Millions of patents were not indexed for searching by structure using graphic input, so the codes developed by Derwent need to be used. In the World Patents Index, these codes serve as links for all patents in appropriate subject areas as far back as 1963 for pharmaceutical patents. As noted above, WPI also has a Markush searching capability for more recent patents in the pharmaceutical, agrochemical, and chemical subject sections. They also have a polymer coding system covering 1966 onward.

The IFI Comprehensive Index and the IFI Uniterm Index provide a type of chemical substructure search capability for US patents back to at least 1964. The American Petroleum Institute databases, APIPAT and APILIT, utilize a system of very broad fragmentation codes to index patents and technical literature, respectively. Coverage is not limited to petroleum, but extends to a wide range of petrochemicals, such as hydrocarbons and oxygenates. The files were purchased by Elsevier in 1999 through its Engineering Index division. They are now available on STN as ENCOMPPAT and ENCOMPLIT, as are the IFI files.

Since patent rights can be assigned, it is sometimes difficult to determine who owns the rights to an invention. The USPTO keeps track of patent assignees, but new owners are not required to notify anyone when an invention has changed hands. There is an IFI CLAIMS/Reassignments file. In addition, there are databases that keep track of legal challenges to a patent, but those simply report the fact of the challenge. Legal databases, e.g., Lexis-Nexis or WestLaw, can be consulted for the outcomes of such challenges.

To obtain a copy of a US patent (or another country's patent) in PDF format, try esp@cenet.

 


 

Part 14: Analytical Chemistry


I. Introduction.

Chemists of all types need to be able to identify with certainty the substances they have made, extracted from a source, or sampled in some manner. In some cases, the species they are testing exist for very short periods of time as intermediates in chemical reactions. Whether they are trying to determine the sequences and structure of biomolecules with molecular weights in the hundreds of thousands or attempting to detect minute quantities of a small molecule that is present as a few parts per billion, analytical chemistry provides many of the tools and techniques to find the answers. Separation science is one area of concern, whether the technique be chromatography, electrophoresis, centrifugation, or some other method of separation.

Spectral databases and compilations in all ranges of the spectrum (UV/visible, infrared, microwave, etc.) as well as data compilations that result from newer spectral techniques are all available to assist in the identification of an unknown substance or the confirmation of a reaction product.

Many areas of science and technology must be called upon to perfect workable techniques for some of the problems the analytical chemist encounters. These include engineering, geology, environmental science, physics, optics, computer science, electronics, and others.

 

An ANALYTE is the substance to be identified, detected, or separated in some manner. A MATRIX is the sample or medium in which the analyte is analyzed.

Sometimes the searches in this area involve seeking out particular pieces of data, and other times they require the use of STANDARD METHODS of analysis to ensure that chemists in diverse operating environments obtain the same results on the same samples. The methods may involve sampling techniques, sample preparation, methods to separate or purify a sample, and methods to identify a pure substance or the components of a mixture. Many of these methods are gathered in books or series that have distilled the most reliable and accurate techniques from other types of chemical publications. At times it may be necessary for the analytical chemist to create a derivative of the analyte in order to form a more volatile or more thermally stable substance that can be separated. The technique is particularly important in chromatography.

II. Encyclopedias, Dictionaries, Data Compilations, and Treatises.

Encyclopedias
The 10-volume Encyclopedia of Analytical Science (1995) covers three broad areas:

Other multi-volume works are the Encyclopedia of Analytical Chemistry and the Encyclopedia of Separation Science.

The Encyclopedia of Nuclear Magnetic Resonance (1996) in 8 volumes contains 720 authoritative articles, the first 200 of which cover the history of this important technique. The encyclopedia appeared approximately 50 years after the first successful NMR experiments on condensed matter. It covers all aspects of NMR.

Among the more specialized encyclopedias that have recently appeared is the 3-volume Encyclopedia of Spectroscopy and Spectrometry (2000). Although the articles are arranged as a traditional encyclopedia in alphabetical order, the editors provide a separate contents list by topics:

Furthermore, each article is flagged as either a "Theory," "Methods and Instrumentation," or "Applications" article.

The 10-volume Encyclopedia of Mass Spectrometry was to commence publishing in 2001.

Dictionaries
A number of one-volume dictionaries appeared in the 1980s for the fields relevant to analytical chemistry, among them:

The definitive source for nomenclature of analytical chemistry is the IUPAC publication Compendium of Analytical Nomenclature (1987).

The large data compilations Beilstein Handbook of Organic Chemistry and Gmelin Handbook of Inorganic and Organometallic Chemistry contain much data of interest to analytical chemists. Now that database versions of these are available, it is easy to determine if a particular piece of analytical data exists for any of the millions of compounds in the databases. The two databases are found on the systems of the major vendors STN International and DIALOG, as well as in the versions searchable with the Beilstein CrossFire system.

Treatises
The largest continuing treatise in analytical chemistry is Wilson and Wilson's Comprehensive Analytical Chemistry. Over thirty volumes of the treatise had been published by the end of 1999. It appears that the 2nd edition of another treatise, the Treatise on Analytical Chemistry, (1978-) has stalled. The 14 volumes of the first part came out between 1978 and 1986.

III. Standard Methods, Handbooks, and Smaller Works.

One of the most popular continuing methods series is Techniques of Chemistry (1971-).  Other specialized titles with important information for analytical chemists who work with biomolecules include Methods of Enzymatic Analysis in 12 volumes and Methods in Enzymology, a continuing series that now numbers in the hundreds of volumes. Included in the latter title are volumes that deal with basic theory, sources of equipment and reagents, and methods for DNA sequence analysis, among many others. Methods in Enzymology is now available on CD-ROM, and a related journal, Methods, is also published.

The Official Methods of Analysis of the A.O.A.C. (Association of Official Analytical Chemists) is the place to look for many of the methods used in testing substances in industry. For example, one finds here a method for determining the refractive index and water content of honey. Major sections are devoted to fertilizers, disinfectants, drugs in feeds, distilled liquors, dairy products, and color additives. Over 2,300 methods are available. Some of the types of information found in the work are:

In addition, the work has an in-depth subject index. It is available in loose-leaf or CD-ROM formats.

A much larger work, the Annual Book of ASTM Standards, appears each year with the latest word on how to test various materials. It is also good for definitions of certain industrial substances, for example, fuel oil:

ASTM Standard D396-98, Standard Specification for Fuel Oils.
 

The first volume of the ASTM set is the index. There are sections devoted to such areas as:

ASTM standards are now on the Web, and a subscription can be placed for as few as 50 copies/year. ASTM also produces the ASTM International Directory of Testing Laboratories.

Specialized works of this type include Standard Methods for the Examination of Water and Wastewater and the NIOSH Manual of Analytical Methods.

Examples of relevant handbooks are:

The last-named work includes a UV absorption index (with increasing values of λmax from 250-795 nm) and the solubility in water of many stains, dyes, and indicators.

IV. Spectral Compilations.

Spectral analytical techniques encompass the full range of electromagnetic radiation. The type of radiation involved in producing a spectrum usually gives its name to the spectral technique.

Types of Spectra and the Transitions They Engender

Name            | Wavelengths                | Transitions
Radio-frequency | 10^-1 - 10^3 meters (m)    | Molecular rotations, NMR
Microwave       | 0.1 - 30 centimeters (cm)  | Molecular rotations, ESR/EPR
Infrared        | 2.5 - 50 micrometers (μm)  | Molecular vibrations
Visible         | 400 - 800 nanometers (nm)  | Electronic excitation (atomic)
Ultraviolet     | 200 - 400 nm               | Electronic excitation (molecular)
X-ray           | 0.05 - 1 nm                | Ionization
Gamma           | < 0.05 nm                  | Nuclear transitions and disintegrations

Moving down in the table above, one finds progressively shorter wavelengths and correspondingly higher energies, because the energy of a given type of electromagnetic radiation is inversely proportional to its wavelength.
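The inverse relationship is the standard Planck relation, E = hc/λ, which these notes do not state explicitly. As a small sketch (the function names are hypothetical; the constants are the exact SI defining values), the energy of a single photon can be computed from its wavelength:

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant h, in joule seconds
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light c, in meters per second
JOULES_PER_EV = 1.602176634e-19    # one electronvolt expressed in joules


def photon_energy_joules(wavelength_m: float) -> float:
    """Energy of one photon, E = h * c / wavelength, with wavelength in meters."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m


def photon_energy_ev(wavelength_m: float) -> float:
    """The same energy expressed in electronvolts."""
    return photon_energy_joules(wavelength_m) / JOULES_PER_EV


# Boundaries of the visible range from the table: 800 nm and 400 nm.
red_edge_ev = photon_energy_ev(800e-9)     # about 1.55 eV
violet_edge_ev = photon_energy_ev(400e-9)  # about 3.10 eV
```

Halving the wavelength from 800 nm to 400 nm doubles the photon energy, which is the inverse proportionality in numerical form.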

A spectrum may be depicted as a plot of the intensity of radiant energy emitted or absorbed versus the energy of the radiation. The energy is usually represented by the wavelength or frequency. Another method of representing spectra is to record a series of numbers that measure the peaks of the emission or absorption spectra. Either or both methods may be found in the databases and reference works that contain spectral data.

One can find new manifestations of certain types of spectra with the introduction of Fourier Transform techniques. Aldrich has libraries of both FT-NMR and FT-IR spectra.

Another spectral technique, not in the table above, is Raman spectroscopy. Raman spectra yield information by using lasers as the radiation source in the far infrared-visible region of the spectrum.

Spectroscopy also embraces the technique of mass spectrometry, wherein the instrument measures the distribution of charged particles produced after ionization, rather than radiation that is emitted or absorbed. The gas-phase ions are separated according to their masses or ratios of mass to charge (m/z). The mass spectrometer's beam of high-energy electrons thus causes organic molecules to ionize and fragment. It then separates the mixture of ions by their m/z ratios and records the relative abundance of each ionic fragment. The resultant plot of ion abundance versus m/z resembles spectra produced by other techniques.

Mass spectra are among those found in the NIST (National Institute of Standards and Technology) Chemistry WebBook, which had in its February 2000 release:

The Mass Spectrum of Isatin is reproduced below:

Isatin Mass Spectrum from NIST

In mass spectrometry, as with other types of spectra, a researcher needs to know what types of compounds or groups yield peaks that match the measured spectrum. Most collections are indexed by the name of the compound or by molecular formula. The Important Peak Index of the Registry of Mass Spectral Data lists by m/z value the first, second, and third most abundant peaks in the Registry, covering over 50,000 compounds. The Wiley Registry of Mass Spectral Data is the largest commercially available collection of mass spectra, with over 275,000 spectra. A smaller, very popular collection of over 75,000 spectra is the NIST/EPA/NIH Mass Spectral Database.
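The idea behind an index of the most abundant peaks can be sketched in a few lines. This is an illustration only, not the method used by any published index: the function names are hypothetical, and the spectra are invented placeholder values, not measured data for any real compound.

```python
def most_abundant_peaks(spectrum, n=3):
    """Return the m/z values of the n most abundant peaks, strongest first.

    spectrum is a list of (m/z, relative abundance) pairs; ties in
    abundance are broken in favor of the lower m/z value.
    """
    ranked = sorted(spectrum, key=lambda peak: (-peak[1], peak[0]))
    return [mz for mz, _abundance in ranked[:n]]


def build_peak_index(library):
    """Map each tuple of top-three m/z values to the compounds that share it."""
    index = {}
    for name, spectrum in library.items():
        index.setdefault(tuple(most_abundant_peaks(spectrum)), []).append(name)
    return index


# Invented placeholder spectra, for illustration only.
library = {
    "compound A": [(41, 30.0), (43, 100.0), (57, 80.0), (71, 45.0), (85, 10.0)],
    "compound B": [(77, 100.0), (105, 95.0), (51, 40.0), (182, 60.0)],
}
peak_index = build_peak_index(library)
```

A measured spectrum can then be reduced to its three strongest peaks and looked up in peak_index to obtain candidate compounds for closer comparison.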

Two companies that have produced a number of standard spectral collections are Bio-Rad Sadtler and Sigma-Aldrich.

The older printed Sadtler collections of Infrared and NMR spectra share a common index that also covers other printed compilations such as Varian and JEOL NMR sets. The references to NMR spectra in those sets are indicated by a "V" and a "J" respectively.

Checking the Sadtler Alphabetical Index for isatin, one finds:

         PRISM  |  GRATING  |     UV    |   NMR   |   C-13


Isatin    2204  |    304    |    590    |  17050  |   6606

The first two columns refer to IR spectra. Both 60 MHz NMR and C-13 NMR spectra are covered in the indexes. Other Sadtler indexes are:

The Sadtler libraries can be purchased on CD-ROM, and there are laboratory devices that include the Sadtler collections for comparison to measured spectra in the lab. In addition, Sadtler's HaveItAll IR option offers over 200,000 spectra of pure organic and commercial compounds on one CD-ROM. It also includes 3,300 Raman spectra.

Nicolet Instruments Corporation and Galactic Instruments Corporation have developed a pay-per-use spectral library service, FTIRsearch.com. Over 71,000 FTIR and 16,000 Raman spectra are included. Other collections of electronic spectra are offered by companies such as Fiveash Data Management. SpecInfo is a database of more than 660,000 proton, C13, MS, and IR spectra that is now available on the Web. Included are the Wiley Registry of Mass Spectra and the NIST Mass Spectral Database. There is also an STN version (formerly called C13NMR/IR).

There are many reference works on spectra. Despite the availability of the many compilations of spectra, it is often impossible to find a needed spectrum in any of them. Databases such as Chemical Abstracts, Beilstein, or Gmelin may then be of use in identifying a source in the primary literature.

V. Crystallography.

The Cambridge Structural Database is the largest collection of crystal structure data in the world. It covers organic and organometallic crystal structures from 1935 onward. Well over 200,000 structures are now in the file. The CSD contains bibliographic information, 2-D chemical connectivity depictions, and superb 3-D visual depictions of the molecules, as shown below in the Conquest version of the database, using the 3-D Visualiser.

CSD depiction of isatin

The CSD has information on the preferred shapes of molecules and the preferred interactions between different molecules and organic functional groups. Both 2D- and 3D-structure searching are possible with the CSD, in addition to pharmacophore searching. A pharmacophore is the specific 3D arrangement of functional groups within a molecular framework that is necessary to bind to a macromolecule or an active site in an enzyme.
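One simple way to think about a pharmacophore is as a set of required distances between functional groups. The sketch below is a simplified illustration under that assumption and does not represent how the CSD software performs its searches; the function name, the feature labels, the coordinates, and the tolerance are all hypothetical.

```python
from math import dist


def matches_pharmacophore(features, constraints, tolerance=0.5):
    """Check 3D feature positions against required inter-feature distances.

    features:    {label: (x, y, z)} positions in angstroms, one per label
    constraints: {(label_a, label_b): required distance in angstroms}
    tolerance:   allowed deviation from each required distance
    """
    for (label_a, label_b), required in constraints.items():
        if label_a not in features or label_b not in features:
            return False
        actual = dist(features[label_a], features[label_b])
        if abs(actual - required) > tolerance:
            return False
    return True


# Hypothetical arrangement: three features of one conformer.
conformer = {
    "donor": (0.0, 0.0, 0.0),
    "acceptor": (3.0, 4.0, 0.0),
    "aromatic": (0.0, 0.0, 6.0),
}
query = {("donor", "acceptor"): 5.0, ("donor", "aromatic"): 6.0}
is_match = matches_pharmacophore(conformer, query)
```

Here the donor-acceptor and donor-aromatic distances are exactly 5 and 6 angstroms, so the conformer satisfies the query within the tolerance.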

There is also an Inorganic Crystal Structure Database.

VI. Biomolecule Sequence and Structure Databases.

The last few decades have witnessed an explosion of growth in data files associated with efforts to solve the sequence and structures of proteins, nucleic acids, and other biomolecules. Each year the journal Nucleic Acids Research has in the first issue published that year a guide to the databases of interest to molecular biologists. Categories of databases include:

The Protein Data Bank and GenBank are two of the better known databases for biomolecules. There is a service from the National Library of Medicine called Entrez that links via the Internet the relevant references from the Medline database to the databases of biomolecular sequences.

VII. The Special Review Issues of Analytical Chemistry and Other Reviews.

For many years, the American Chemical Society journal Analytical Chemistry published special issues in alternating years devoted to "Application Reviews" and "Fundamental Reviews." Applications such as air pollution, food, forensic science, particle size analysis, and water analysis are among the topics in the former, whereas thermal analysis, chemical sensors, and ion-selective electrodes might be topics found in the latter. These review articles appeared for at least 50 years.

Another major review serial is Methods of Biochemical Analysis.

VIII. Abstracting and Indexing Journals and Databases.

A large number of specialized A&I services can be found for analytical chemistry, including:

Several of the A&I services can now be had on CD-ROM or searched as online databases. Analytical WebBase (incorporating Analytical Abstracts), produced by the Royal Society of Chemistry, has comprehensive coverage for all aspects of analytical chemistry, including instrumentation and applications.

Analytical WebBase search for fuel oil as a matrix

Analytical Abstracts covers more than 260 journals in 12 languages, manufacturers' application notes, and Australian and British standards, as well as new books.

 


 

Part 15: Physical Property Information


I. Introduction.

What is the difference between a chemical and a physical property of a substance? There is no clear-cut answer. In general, you may find one source that calls a given property a physical property and another source that calls it a chemical property. Melting point, boiling point, density, and other such properties typically found in one-volume data handbooks are usually considered physical properties. Other properties, such as reaction yields, types of crystals formed, what solvent a substance dissolves in, etc. are usually thought to be chemical properties. Let's not worry about such distinctions in this course. However, you do need to be aware that certain types of data are not covered in this lecture. For example, spectral data sources are discussed entirely in the lecture that covers analytical chemistry and constitutional chemistry, and environmentally relevant data sources are found in the section on chemical safety or toxicology information.

 

Physical property data are difficult to find. Over time, the names of the properties of interest may have changed, they may be reported in units that are different from those sought, or the conditions (temperature and pressure, for example) under which the published data values were measured do not correspond to those of interest. Sometimes an approximate value is known (or at least a general idea of the magnitude of the expected value) before a data search is begun. Nowadays, there are sources available to predict physicochemical properties based on input parameters.
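Differences in units are the easiest of these obstacles to handle programmatically. As a minimal sketch (the function names and the tolerance are hypothetical), temperatures reported in different units can be brought to a common scale before they are compared:

```python
def to_kelvin(value: float, unit: str) -> float:
    """Convert a temperature in K, degrees C, or degrees F to kelvin."""
    unit = unit.strip().upper().lstrip("°")
    if unit == "K":
        return value
    if unit == "C":
        return value + 273.15
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    raise ValueError(f"unsupported temperature unit: {unit!r}")


def same_temperature(a, unit_a, b, unit_b, tolerance_k=0.5):
    """True if two reported temperatures agree within tolerance_k kelvin."""
    return abs(to_kelvin(a, unit_a) - to_kelvin(b, unit_b)) <= tolerance_k
```

For example, same_temperature(80.0, "C", 353.15, "K") is true, so a melting point reported as 80 °C in one source and 353.15 K in another would be recognized as the same value. Differences in measurement conditions still have to be judged by the searcher.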

 

It is important to remember that data most often appear first in the primary literature, then, through a process of evaluation and selection of the most reliable data, are collected into the various compilations discussed in this lecture. It is not unusual to find several different values for the same physical property, so a choice sometimes has to be made among conflicting values. At such times, the searcher has to take into account several factors, including the reputation of the publisher of the compilation or the reputation of the author or research institution that made the original determination.

 

There are many one-volume sources of data, or handbooks, as they are usually called. Frequently the handbooks are derived from (copied from) larger multi-volume data compilations. These larger sources are produced by data centers or information analysis centers whose job it is to critically evaluate the data found in the primary literature. The critically evaluated data compilations may have an indication of the degree of reliability of the data, such as:

The large data compilations all provide references to the original primary literature where the data were first published. Small handbooks do not give the original literature citation in most instances, but may indicate where in a large data compilation the copied data are found. Thus, you can track down the original source of data if you suspect an error in transcription may have occurred. For example, the CRC Handbook of Chemistry and Physics will refer you to the volume of the Beilstein Handbook of Organic Chemistry for its source of the data, and Beilstein will have the references to the original primary data that they copied.

II. Guides, Indexes, and Directories of Data Sources.

There are several books that describe data sources in detail, among them The Search for Data in the Physical and Chemical Sciences (1984) and the CODATA Directory of Data Sources for Science and Technology (1985). Another guide is Handbooks and Tables in Science and Technology (1994). Now that databases have begun to appear with large sets of data in them, it is possible to determine if a particular property exists in a database for a substance of interest. For example, on the STN International system, the NUMERIGUIDE file looks across many of the hundreds of databases included in that vendor's offerings and indicates the databases that have the property. For a given database, such as the Beilstein CrossFire system (or the Beilstein Database on STN), it is possible to search not only for a match of a given substance and a property, but also to search the database for the property itself and a particular value or range of values to identify the substance(s) that match the search criteria.
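The two directions of searching described above (from substance to property, and from a property range back to substances) can be sketched with a small in-memory table. This is only an illustration of the idea and not the query language of STN or CrossFire; the record type, function names, substance names, and values are all invented placeholders, and every value of a given property is assumed to share one unit.

```python
from typing import NamedTuple


class PropertyRecord(NamedTuple):
    substance: str
    prop: str       # property name, e.g. "melting point"
    value: float
    unit: str


def lookup(records, substance, prop):
    """Substance-to-property search: all reported values for one substance."""
    return [r.value for r in records if r.substance == substance and r.prop == prop]


def find_substances(records, prop, low, high):
    """Property-to-substance search: substances with a value in [low, high]."""
    return sorted({r.substance for r in records
                   if r.prop == prop and low <= r.value <= high})


# Invented placeholder data, for illustration only.
records = [
    PropertyRecord("substance A", "melting point", 81.0, "C"),
    PropertyRecord("substance A", "melting point", 79.5, "C"),
    PropertyRecord("substance B", "melting point", 122.0, "C"),
    PropertyRecord("substance C", "boiling point", 80.0, "C"),
]
hits = find_substances(records, "melting point", 75.0, 85.0)
```

Note that lookup() can return several values for the same substance, which mirrors the conflicting values a searcher often finds in practice.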

III. Landolt-Börnstein Numerical Data and Functional Relationships.

Landolt-Börnstein (L-B for short) is the largest printed compilation of numerical data in existence today, with over 280 volumes. It covers many areas of interest to chemists, but unfortunately is underused by the many chemists who have not studied German. There are now English-language subject and chemical substance indexes that assist in locating a table in the many volumes of the set. Despite the appearance of a CD-ROM version of the indexes starting in 1996 (including an Index of Organic Compounds), the printed L-B is still a difficult set to use. There is now a full database version available on the Web.

Data in Landolt-Börnstein covers:

IV. Beilstein Handbook of Organic Chemistry and Beilstein CrossFire.

If you are looking for a physical property of an organic substance or a two-dimensional depiction of it, the printed Beilstein Handbook of Organic Chemistry or corresponding database is the place to look. Despite the word "handbook" in the title, Beilstein is certainly not a one-volume work, not even a 10-volume work, but one whose volumes number in the hundreds! Beilstein covers the beginning of organic chemistry in the late 18th/early 19th centuries to the present. Although the coverage of the print volumes is considerably behind the present date, the currency of the database is quite good, within a year of the current literature. Not all of the data in the later years may yet have been critically evaluated, but in many cases Beilstein gives the value(s) for the physical properties. Even for those properties in the last decade or two that do not have actual values in the database, an indication that the property may be found in the primary literature and the references to the original literature are given.

There are dozens of physical properties reported in Beilstein. Since it covers in excess of 6.5 million organic compounds, it is a prime source for data mining, the use of a database to compile a data set that previously did not exist or to look for data that corroborates a hypothesis one may have formed.

The capability to search for substances having certain properties or a range of numerical values of properties is inherent in the CrossFire database (and other implementations of the Beilstein database), so it is of particular use in searching for organic materials with a given set of properties. Think of how valuable this might be when combined with the capability to conduct exact structure or substructure searches across the millions of compounds in the database.
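
The idea of searching by a range of property values can be sketched in a few lines of Python. This is only an illustration of the concept, not the CrossFire query language: the `Substance` record, the `search` function, and the three compounds with their values are all hypothetical placeholders invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Substance:
    name: str
    melting_point_c: float  # placeholder value, not measured data
    boiling_point_c: float  # placeholder value, not measured data


def search(records, **ranges):
    """Return the records whose named properties all lie inside (low, high)."""
    hits = []
    for rec in records:
        if all(low <= getattr(rec, prop) <= high
               for prop, (low, high) in ranges.items()):
            hits.append(rec)
    return hits


# Hypothetical data set standing in for a property database.
records = [
    Substance("compound A", 50.0, 210.0),
    Substance("compound B", 120.0, 300.0),
    Substance("compound C", 125.0, 180.0),
]

# Combine two range conditions, as one might when screening for materials.
hits = search(records,
              melting_point_c=(100.0, 130.0),
              boiling_point_c=(250.0, 350.0))
print([h.name for h in hits])
```

Each additional keyword argument narrows the hit list further, which is the same logic as adding another property field to a database query.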

The following types of information are given in Beilstein:

Compounds are arranged in the printed Beilstein in a manner that keeps acyclic, isocyclic, and heterocyclic compounds together and always in the same relative place in each time period covered.

Since the original printed Beilstein Handbook is in German, one often encounters German terms or abbreviations to indicate the property reported. Therefore, it is necessary to know the German equivalents of English-language physical property data terms. There is a comprehensive index for the original Beilstein and the first four supplements (coverage through 1959) with a German-language chemical substance name index (v. 28) and a chemical formula index (v. 29). Volumes covering the literature after 1959 are in English.

 

V. Dictionary of Organic Compounds and Related Products.

The Chapman & Hall company originally published a series of printed and CD-ROM products with titles that begin Dictionary of .... These are now available online from the CRC CHEMnetBase. The works are really data compilations or larger handbooks, but in a certain sense are dictionaries. That is to say, they tend to collect together in the same section of the printed works chemical substances that are structurally similar and therefore may have similar common names. Particularly significant features of the Dictionary of Organic Compounds are the structural depictions of the substances and properties of derivatives, as well as references to the original literature for synthesis, spectra, etc. of the compound.

Collectively, the dictionaries cover these areas:

Also available is the Merck Index on CD-ROM, and it features the capability to search by structure or substructure of the substance. Both the Merck database and certain of the dictionaries are also available from the online vendors.

VI. Gmelin Handbook of Inorganic and Organometallic Chemistry.

Beilstein does not routinely cover organometallic compounds. Those are the purview of its sister publication, the Gmelin Handbook of Inorganic and Organometallic Chemistry. With the same degree of comprehensiveness as Beilstein, Gmelin supplies the largest single source of information and data on inorganic and organometallic compounds. The arrangement of Gmelin is by element. Information includes:

The Gmelin system ranks substances in order of their tendency to behave as metals. Compounds are arranged in Gmelin according to the principle of last position, determined by the location of the element's number in the Gmelin system. For a given substance, Gmelin provides information on the occurrence, methods of preparation, physical properties, and chemical properties.

 

Gmelin is available as an online database on the commercial vendors' systems, but is also being marketed now as a database that can be searched with the same CrossFire system used to search the Beilstein CrossFireplusReactions database by structure and other parameters.

VII. Encyclopedias and Books as Sources of Physical Property Data.

Many books (monographs, treatises, and textbooks) include tables of data. Unfortunately, there is no comprehensive index that might tell you which book has a particular data compilation. Sometimes a general search of a library's OPAC (online catalog) for a class of compounds will turn up promising titles that can be checked for data. Treatises, since they cover broad subject areas, are potentially very good sources, but are often poorly indexed or lack a comprehensive index for the entire set.

 

Encyclopedias, on the other hand, are excellent sources of physical property information. The most important chemical encyclopedia is the Kirk-Othmer Encyclopedia of Chemical Technology, now in its 4th edition. This work is also available as a CD-ROM product and as an online or Web database.

 

Other examples of encyclopedias relevant to this area are the Encyclopedia of Physical Science and Technology (3rd ed., 2002) and such specialized encyclopedias as the Polymeric Materials Encyclopedia (1996) and the Encyclopedia of Advanced Materials (1994).

VIII. Journals as Data Sources.

It would seem that journal articles would be excellent sources of physical property data, and for standard data that are routinely reported, they are. The problem is that indexing of physical property data contained in journal articles is not always well done by the abstracting and indexing services. However, there are some journals that are specifically designed to publish data. Two of those are the Journal of Physical and Chemical Reference Data (1972-) and the Journal of Chemical and Engineering Data (1956-). The former title has comprehensive indexes for each group of 10 volumes. Sometimes, an earlier, separately published work will be updated in a journal article or articles. For example, the 1958 Chemical Society publication Tables of Interatomic Distances and Configurations in Molecules and Ions (Special Publication no. 11, with its supplement, Special Publication no. 18, published in 1965) was updated in the Journal of the Chemical Society, Perkin Transactions 2, 1987, Supplement pages S1-S19, for the organic portion and in the Journal of the Chemical Society, Dalton Transactions, 1989, Supplement pages S1-S83, for the organometallic and coordination complexes.

One technique to locate data in journal articles is to perform a search that includes terms in the abstracts of the articles in a bibliographic database such as the Chemical Abstracts CA File on STN or SciFinder. Frequently, the actual values of the most important data will be included in the abstracts. Another technique is to search the fulltext files of the electronic versions of the journals themselves. As more and more scientific journals become available in electronic format, this should prove to be an increasingly important approach, especially as vendors find more innovative ways to search across journal titles. The American Chemical Society now offers free the Supporting Information for articles published in ACS journals. This is data that is too voluminous to publish in the articles themselves. A number of other publishers take similar approaches.

IX. Smaller Data Compilations and Handbooks.

There are many, many handbooks published in science and technology. Perhaps the best known of those in the physical sciences is the CRC Handbook of Chemistry and Physics (now published annually). The CRC Press publishes dozens of handbooks in various areas and has produced a Composite Index for CRC Handbooks (1991) that leads to the appropriate work where a substance and a property may be found. There is also a CD-ROM version of this title. Other famous one-volume handbooks are:

Some of these are now made available on the Web by knovel.

Users of the more general one-volume handbooks, such as Lange's or the CRC, should be aware that these cover only the most common 10,000-15,000 chemical substances and include for the most part the same data for those compounds. Furthermore, they all take the data from standard sources such as Beilstein or Gmelin. Hence, it is usually not productive to search through many of them in hopes of finding a value when one has failed to provide the answer. If possible, start your search with larger sources such as Beilstein, Gmelin, or the Chapman & Hall/CRC dictionaries. If those are not available, you will find larger numbers of substances covered in some of the specialized sets, such as the 3-volume Handbook of Data on Common Organic Compounds (1995). Corresponding to that work is v. 5.0 of the CD-ROM product Properties of Organic Compounds, a database covering over 27,000 organic compounds that is searchable by structure. There is also an online version on STN and DIALOG, and it is also available on ChemWeb.Com, as are the Chapman & Hall dictionaries.

A much larger number of chemical substances can be found by using ChemFinder.Com on the Web courtesy of CambridgeSoft. ChemFinder.Com has over 75,000 compounds taken from over 350 indexed sites. It offers several modes of searching, including name, molecular formula, molecular weight, CAS Registry Number, and structure. The NIST WebBook is another popular Web site that has data for over 6,000 organic and inorganic compounds. Also included are thermochemical and spectral data. NIST is the National Institute of Standards and Technology (formerly, the US National Bureau of Standards). It publishes a number of data compilations and Standard Reference Data Sets.

Another interesting Web site, though with relatively few compounds (2,483 currently), is the Organic Compounds Database at Colby College. It is searchable by name, molecular formula, or data values for properties such as melting point, index of refraction, formula, absorption wavelengths, mass spectral peaks, and type of chemical substance.

X. Sources From knovel.

The McGraw-Hill product AccessPerry's is available from knovel (http://www.accessperrys.com/). AccessPerry's covers thousands of facts, figures, formulas, tables, graphs, and calculations for chemistry and physics. Exact values or ranges of values of physical properties can be used as search keys in addition to names of substances or physical properties. Included are three handbooks: Perry's Chemical Engineers' Handbook, Lange's Handbook of Chemistry, and Chemical Properties Handbook.

knovel has done a great service to the scientific community by providing free versions of some standard physical property sources. They are:

Free Databases from knovel

 

 


 

Part 16: Searching for the Synthesis or Reactions of Specific Compounds or Classes of Compounds (Reaction Chemistry)


I. Introduction.

Reaction chemists are interested in a variety of information when planning a synthesis. That may include the conditions under which the reaction is to occur, the starting materials and reagents, catalysts, reaction sites, yields, products, by-products, functional group transformations, bonding changes, and mechanisms of the reactions. [A reaction MECHANISM is "a detailed description of a particular reactant to product path, together with information pertaining to intermediates, transition states, stereochemistry, the rate-limiting step, electronic excitation and transfer, and the presence of any loose or intimate electron ion pairs." (Ash, 1985)] A combination of some or all of these concepts may provide a path to the needed information, depending on the secondary source that is used. Once a compound has been synthesized, a variety of analytical and physical property techniques may be used to verify that the correct substance has been made.

One way to search for reaction information is by the name of the reaction. That may be a more general name, such as a substitution reaction, or it may be an eponym from the name of the chemist(s) who first developed the synthetic method, such as the Curtius Rearrangement Reaction. Other search systems have developed around codes for various key features in the reaction.

One can, of course, search large databases, such as Chemical Abstracts, for reaction information. But in recent years, a large number of specialized reaction chemistry databases have come on the scene to assist in a search for reaction information. In addition, there are a number of specialized printed sources that codify the many discoveries in the reaction chemistry area. Some of the most important sources are discussed in this lecture.

II. Beilstein and Gmelin.

We have already encountered the Beilstein Handbook of Organic Chemistry and the Gmelin Handbook of Inorganic and Organometallic Chemistry when discussing analytical and physical property information needs. These two comprehensive sources also have preparation and formation information in them. Of course, for the older volumes it is necessary to cope with the text in German, but here is a list of selected German words for reaction chemistry that should assist you.

             German                       English
        
             Abbau                        degradation
             Alkylierung                  alkylation
             Ausbeute (an)                yield (of)
             Ausgangsmaterial             starting material
             Ausgangsverbindung           starting compound
             ausgehen(d) von              to start(ing) from
             behandeln                    to treat
               beim Behandeln               on treating
               nach Behandlung              after treatment
             bereiten                     to prepare
             beständig                    stable
             bilden                       form
               bildet sich                  is formed
             Bildung                      formation
             Darstellung                  preparation
             Derivat                      derivative
             einbringen                   to introduce
             einleiten                    to feed, pass into
             Einschlussverbindung         inclusion compound, clathrate
             eintragen                    to introduce
             enthalten(d)                 to contain(ing)
             ergeben                      to result
             Ersatz                       substitution, replacement
             Folge                        sequence
             formulieren                  to formulate
             geben                        to give
               gibt                         gives
             Gemisch                      mixture
             Geschwindigkeitskonstante    rate constant
             Gewinnung                    isolation, extraction
             Hauptprodukt                 main product
             herstellen                   to prepare, manufacture
             Herstellung                  preparation, manufacture
             hervorgehen (aus)            to originate (from), result (from)
             hindern                      to inhibit, prevent
             Katalysator                  catalyst
             künstlich                    artificial
             liefern                      to give, deliver
             Nebenprodukt                 by-product
             Präparat                     preparation
             Quelle                       source
             Raumtemperatur               room temperature
             Reagenz                      reagent
             Reinheit                     purity
             Reaktionsfolge               reaction sequence
             Rohprodukt                   crude product
             Säure                        acid
             Salz                         salt
             Schluss                      conclusion, end
             Seitenkette                  side chain
             Sprengstoff                  explosive
             Stoff                        matter, substance
             Summenformel                 stoichiometric formula
             überführen (in)              to transform into
             Überführung                  transformation
             übergehen                    to be converted into
             Umsetzung                    reaction
             umwandeln                    to transform, convert
             Umwandlung                   transformation
             ungesättigt                  unsaturated
             unrein                       impure
             unvollständig                incomplete(ly)
             Verbindung                   compound
             vereinigen                   to combine, join
             Verfahren                    procedure
             Verseifung                   saponification
             versetzen                    to add
             Versuch                      experiment
             vollständig                  complete(ly)
             zersetzen (sich)             to decompose
             Zwischenstufe                intermediate

 

 

A search of the Beilstein CrossFireplusReactions Database for isatin shows that, at the time of the search, the record for the compound had 2,003 entries for Reaction and 6 entries for Isolation from a Natural Product. It also included 15 Derivatives. One of the references is:

------------------------------------------------------------------------------------------------
| Reaction ID 123539
| Reactant BRN 131835 3-hydroxy-1,3-dihydro-indol-2-one
| Product BRN 383659 indole-2,3-dione
| Reaction Classification Preparation | Reagent ammonium nitrate, acetic acid
|
| Ref 1 1046338; Journal; Klein; JACSAT; J.Amer.Chem.Soc.; 63; 1941; 1474;
------------------------------------------------------------------------------------------------

Note that the Beilstein Registry Number (BRN) for isatin, 383659, is not the same as the Chemical Abstracts Service Registry Number for the compound, 91-56-5. Not all compounds in Beilstein have CAS RNs, and it is not recommended to search the database that way. In fact, the preferred way to search for a compound in Beilstein is by structure.
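
One practical difference between the two identifiers is that a CAS Registry Number carries a built-in check digit, which makes it possible to catch a mistyped number before searching. The sketch below applies the publicly documented CAS check-digit rule (multiply the digits before the check digit by 1, 2, 3, ... counting from the right, add the products, and take the remainder on division by 10). The function name `is_valid_cas_rn` is an illustrative choice, not part of any vendor's software.

```python
import re


def is_valid_cas_rn(cas_rn: str) -> bool:
    """Check the format and check digit of a CAS Registry Number."""
    # Format: 2 to 7 digits, hyphen, 2 digits, hyphen, 1 check digit.
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_rn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


print(is_valid_cas_rn("91-56-5"))  # isatin, as quoted above: True
print(is_valid_cas_rn("383659"))   # a BRN is not in CAS RN format: False
```

For isatin the weighted sum is 6x1 + 5x2 + 1x3 + 9x4 = 55, and 55 mod 10 = 5, which matches the final digit of 91-56-5.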

Thus, we see that it is possible to find reaction information using the compound name or structure if the record for the compound is among the many millions of compounds currently in the database. An impressive feature of Beilstein CrossFire for reaction searching is the capability to input the structure of a starting material and/or the structure of a desired product and conduct a reaction query. When at least two depictions of chemical substances are drawn on the screen and one is selected, the reaction editing component of the system appears, giving the option to define the selected substance as either a Reactant or a Product. It is even possible to map one or more atoms from the Reactant molecule onto a position in the Product in order to ensure that the reaction(s) found in the search will allow the desired modifications to occur only at specific points on the molecule.
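
The logic of a role-based reaction query (a given substance as Reactant, as Product, or both) can be illustrated with a small Python sketch. It matches substances by Beilstein Registry Number only and does not attempt real structure or substructure matching, which is what CrossFire itself does. The `Reaction` class and `find_reactions` function are hypothetical names; the first record is the one quoted above, and the second is an invented placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    reaction_id: int
    reactant_brns: frozenset
    product_brns: frozenset


def find_reactions(reactions, reactant=None, product=None):
    """Return reactions that contain the given BRNs in the requested roles."""
    hits = []
    for rxn in reactions:
        if reactant is not None and reactant not in rxn.reactant_brns:
            continue
        if product is not None and product not in rxn.product_brns:
            continue
        hits.append(rxn)
    return hits


ISATIN_BRN = 383659

reactions = [
    # Reaction ID 123539 from the record quoted above.
    Reaction(123539, frozenset({131835}), frozenset({ISATIN_BRN})),
    # Hypothetical placeholder (IDs invented): isatin used as a reactant.
    Reaction(1, frozenset({ISATIN_BRN}), frozenset({2})),
]

as_product = find_reactions(reactions, product=ISATIN_BRN)
print([rxn.reaction_id for rxn in as_product])
```

Specifying both a reactant and a product narrows the search to reactions that connect the two, which is the situation described for a starting material and a desired product.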

Isatin structure search on Beilstein as product

At the time of the graphical search, 60 reactions were found in which isatin was a product, and the first of those is shown below.

Result of isatin search on Beilstein

The Gmelin database has now joined the CrossFire family of searchable databases. With v. 6.0 of the CrossFire software and the Gmelin database at the beginning of 2002, it is now possible to do graphical reaction searching in Gmelin. The database may also be searched on commercial vendor systems, such as STN International.

III. Methods of Organic Chemistry (Houben-Weyl).

Since 1995, the famous German set Houben-Weyl Methoden der Organischen Chemie has been published in English by Thieme, publisher of a number of synthetic chemistry books and journals. The volumes are quite expensive, and only the most affluent academic chemistry libraries are likely to have maintained a subscription to Houben-Weyl over the years. For example, a ten-volume paperback edition of the English-language volumes of Houben-Weyl on the topic of "Stereoselective Synthesis" cost DM 2995 until January 31, 1997 (DM 3600 after that date). A new edition of the set, re-titled Science of Synthesis: Houben-Weyl Methods of Molecular Transformations, began to appear in 2000 and will be published in 48 volumes over the next ten years. The database began to be available in 2001.

Houben-Weyl is organized by class of compound or functional group to be synthesized. Thus, it is structured according to the type of product formed, with only the principal function considered. This is determined by the level of oxidation, substitution or saturation of the carbon atom(s). In general, higher oxidation levels have priority over lower ones and heterofunctional atoms or groups are classified according to a priority list that puts CX3 at the top and P at the bottom. In Houben-Weyl, transformations of the principal functional group are illustrated by typical examples. Patents are included in the scope of coverage, and the important preparative methods for all classes of compounds are presented. A formula index and an index of substance classes were issued in 1986-87 (v. XVI/1-2). Included are the following indexes:

Each volume of Houben-Weyl contains a bibliography of important review articles or books, as well as author and subject indexes.

IV. Treatises, Collected Works, and Other Important Works.

Inorganic Reactions and Methods, which began publication in 1986, is now complete. Volumes cover ways of forming bonds with inorganic elements, methods of effecting various types of reactions, and methods of characterizing the compounds. The set is well indexed. Of particular note is a permuted formula index which groups all compounds containing a given number of an element in one section of the index. Another important inorganic set is Synthetic Methods of Organometallic and Inorganic Chemistry, which began publication in 1996. A longstanding inorganic series that has appeared since 1939 is Inorganic Syntheses, for which a collective index of volumes 1-30 of the series, covering 1939-95, has now appeared. The Encyclopedia of Inorganic Chemistry in 8 volumes (1994) covers all aspects of inorganic chemistry, including information on reactions and bonding energetics.

Pergamon Press has published a large number of multi-volume treatises on various areas of chemistry over the years:

A major thrust of the Pergamon treatises is syntheses and reactions. Each of the sets has a distinguished editorial board and is thoroughly referenced to the original primary literature. A feature of some of the treatises is an index of review articles and specialist textbooks relevant to the topic. A few of the earlier Pergamon treatises, Comprehensive Heterocyclic Chemistry and Comprehensive Medicinal Chemistry, have been made into searchable databases by MDL Information Systems Inc. (formerly, Molecular Design Ltd.). MDL also produces the Available Chemicals Directory (ACD), a database that combines the catalogs of dozens of chemical suppliers.

Organic Syntheses began publication in 1921 and is published annually. The procedures are cumulated into collective volumes (with revisions if necessary). Nine of the collective volumes covering annual volumes 1-74 have been published to date. The articles are sufficiently detailed so that the reactions described can usually be carried out without consulting the original primary literature. In later years, the emphasis has been on model compounds and procedures that illustrate important types of reactions. The 1991 Organic Syntheses Reaction Guide covers experimental procedures in Collective Volumes 1-7 and annual volumes 65-68. Eleven broad classes of reactions are used to index the reactions. MDL also makes available to its customers the ORGSYN database that covers the entire Organic Syntheses collection.

In 2001, CambridgeSoft.Com, Wiley, and DataTrace publishing company put a free version of Organic Syntheses on the Web. It requires the installation of the free ChemDraw Net plugin from CambridgeSoft, but once that is in place, even structure searches can be performed. Additional ways of searching the database include CAS RN, molecular formula, and chemical name, plus author and keyword searching. Below is the result of a structure search on Organic Syntheses for Isatin.

Results 1 of Isatin Search on Organic Syntheses on the Web

Part of the page from the third reference is shown below.

Result of Isatin Search on Organic Syntheses on the Web

It is often useful to find a description of a reagent to determine whether it has been used in the preparation of a given compound or a compound that is similar to the one which is to be made. A recent set is the 8-volume Encyclopedia of Reagents for Organic Synthesis (1995). The preface to the work notes the "vital need to know which reagent will perform a specific transformation." Over 3,000 reagents are arranged alphabetically by IUPAC name. The set is indexed by formula, structural class, function, and subject indexes. Beginning in 2001, a new version of EROS is available only on the Web as a subscription item from Wiley. Prior to the appearance of EROS, the venerable Fieser and Fieser's Reagents for Organic Synthesis (1967-) had been the standard source for reagents. It is especially useful for planning a synthesis and for answering questions that may arise in the course of a synthesis. One can find information on methods of preparation or purification, uses, suppliers, and reaction diagrams. Indexes for reactions, methods, authors, and subjects are provided.

An offshoot of EROS is the four-volume Handbook of Reagents for Organic Synthesis (2000). Intended as a lower-cost reference work that would be available in or near laboratories, the Handbook includes information from the original 8-volume work. It covers in separate volumes: "Reagents, Auxiliaries, and Catalysts for C-C Bond Formation," "Oxidizing and Reducing Agents," "Acidic and Basic Reagents," and "Activating Agents and Protecting Groups."

Larock's Comprehensive Organic Transformations: A Guide to Functional Group Preparations appeared in 1989 and covers the literature through 1987 in one volume. Almost 15,000 reactions and over 23,000 references are included for reactions that yield over 50%. A CD-ROM version of the first edition of the work appeared in 1997, and a revised print edition published in 1999 extended the literature coverage for selected journals through 1995.

More thorough coverage of "The Chemistry of Functional Groups" can be found in the volumes that bear that series name. Entire volumes are dedicated to topics such as The Chemistry of Alkenes (1964), The Chemistry of Enols (1990), etc. A useful means of finding an appropriate volume in the series is Saul Patai's Guide to the Chemistry of Functional Groups Series.

Bretherick's Reactive Chemical Hazards is the place to look for information on all types of dangerous reactions. Included is every chemical for which documented information on reactive hazards has been found over the years. With the fifth edition, Bretherick's began to appear in CD-ROM as well as print formats. The online version of the database on ChemWeb is both text- and structure-searchable.

V. Reaction Databases and Specialized Abstracting Services.

As noted above, it is possible to find reaction information in more general A&I services, such as Chemical Abstracts. However, there are a number of special services that are devoted to reaction or synthetic chemistry. As such, they pay more attention to the aspects of the literature that are important to the reaction chemist. Among the specialized A&I services are:

The Index Chemicus database reports on over 200,000 new compounds each year. Those are found in 110 of the leading natural products, agrochemical, organic, and medicinal chemistry journals. Index Chemicus, which began publication in 1960, is unique in that it includes unisolated synthetic intermediates in the database. Biological activity is one of the properties indexed. In addition to the printed Index Chemicus, the product is now available on CD-ROM for Windows and as an MDL ISIS or Oracle database. Coverage of the database begins with 1993.

The companion product Current Chemical Reactions split off from Index Chemicus in 1979. CCR has examples of reaction types. The database, which exists in MDL's ISIS/Base or REACCS formats and Daylight Chemical Information Systems' reaction system, covers over 250,000 reactions since 1986. It allows you to find synthetic routes by searching the substructure of the product.

A hybrid in the reaction chemistry product line of the Institute for Scientific Information is the Reaction Citation Index. The database provides search avenues via reaction or bibliographic data from Current Chemical Reactions, but has the unique feature of cited reference searching with references taken from ISI's Science Citation Index bibliographic database. An ISI product that can be used on much smaller computers is the CD-ROM database ChemPrep. Covering the literature from 1985 to the present, ChemPrep consists of three quarterly, cumulative updates and an annual archival disc. The software for drawing and searching structures is included.

Derwent's Journal of Synthetic Methods began publication in 1975 as a successor to a loose-leaf service that supplemented the annual Theilheimer's Synthetic Methods of Organic Chemistry. It has long been available as a database on ORBIT and is now on the STN International online system as file DJSMDS/DJSMONLINE. There is also an MDL REACCS version of JSM. Another option is the CD-ROM version covering 1975-94 and monthly updates since. The database gives a structure-searchable approach to novel synthetic methods from the worldwide patent (about 14 percent of the total citations) and journal scientific literature. Only new methods or the most synthetically useful modifications of known methods are abstracted. Consequently, only about 3,000 reactions each year enter the database. All references are cross-referenced to relevant prior art and other similar reactions. On STN, the page images of the printed Derwent Journal of Synthetic Methods are included for all records. The annual Theilheimer volumes also continue to be published. MDL has over 46,000 reactions from volumes 1-34 (1946-81) of Theilheimer as a REACCS or ISIS database.

Also on STN is the SYNTHLINE Drug Synthesis Database that contains the schemes for synthesis of drugs currently on the market or in development worldwide from 1984 to the present.

MDL has also developed some specialized reaction databases marketed only to customers of the MDL ISIS/Base or REACCS products. These include METALYSIS for transition-metal-mediated chemistry and CHIRAS, a database of asymmetric synthesis. Produced jointly with Fachinformationszentrum Chemie GmbH (FIZ Chemie) is MDL's Current Synthetic Methodology (CSM) database. CSM covers innovative and significant reactions from 1992 onward. The reaction data in CSM is a subset of the much larger ChemInform RX database. Wiley-VCH's ChemInform is a weekly abstracting journal that publishes some 20,000 abstracts each year. It is available in print, CD-ROM, or Internet versions.

With the fall of the Soviet Union a huge database of single-step reactions, whose source articles contain synthesis data, became available in several formats. Produced by the USSR Academy of Sciences All-Union Institute of Scientific and Technical Information (VINITI) and the Zentrale Informationsverarbeitung Chemie (ZIC) in Berlin, InfoChem originally contained over 2.2 million compounds, with over 1.8 million reactions from 1975-88. Now enhanced, the SPRESI structural and reaction database is provided via the Internet by InfoChem GmbH. The entire InfoChem database with 2.5 million reactions is also available in CD-ROM format from Springer-Verlag. A subset containing 370,000 reaction types searchable by structure is available as the CHEMREACT file on STN. In 2001, CAS integrated the data with its CASREACT database. Smaller structure-searchable subsets have been produced by Springer-Verlag for PC platforms. For the Windows environment, these are called ChemReactXX, with the number of reactions indicated by the XX in the title. For example, ChemReact41 is a Windows CD-ROM set with 41,000 high-yield reaction types. (Versions with 10,000 and 32,000 reactions are also sold). For the MDL ISIS/Base or REACCS environments, the PC product is known as ChemSelect. Other variants of the database include a version that works with Chemical Design Ltd's ChemRXS and the Daylight Chemical Information Systems versions, Spresi95 and Spresi95Preps.

VI. CASREACT.

The CASREACT file on STN has reaction information from the organic sections of Chemical Abstracts from 1907 to the present. Originally only journal articles were covered in CASREACT, but since January 1991, patents are also included. The patent coverage was substantially increased in 2001 by the addition of records from 1974-1984 that are derived from the ZIC/VINITI data file provided by InfoChem. Both single-step and multi-step reactions are included. No reactions involving inorganic or coordination compounds or polymers are found in CASREACT. However, for the documents covered, the records actually contain structure diagrams for reactants and products. CAS Registry Numbers are found for reactants, products, reagents, solvents, and catalysts. The yields are given for many products, and textual information also describes the reactions. Mapping of atoms between reactants and products is possible in a structure search.

CASReact is one of the databases accessible through CAS's SciFinder Scholar product. Once a structure is drawn, you have a number of options to refine the retrieved records, including:

The images below show the sequence of a search for isatin and the carboxylic acid functional group in a reaction.

[Screenshots: SFS reaction search for isatin, steps 1-3]

VII. Chemical Abstracts and CA Databases.

When searching Chemical Abstracts or the corresponding database, you need to keep in mind the criteria for indexing a chemical substance. A compound is indexed in CA only if:

Thus, even though an article might include the preparation of derivatives, if the only purpose was to facilitate isolation or analysis of a parent compound, no index entries would be made for the derivatives (unless some new data about them appeared in the article). CAS chooses the most specific term possible for indexing purposes, for example, the name of a specific compound over the class of compounds to which it belongs and the name of a more specific class over a more general class of substances. A class entry (for example, esters) is made in indexing a document only under certain circumstances:

The substance class heading is chosen to be as specific as possible, for example, aromatic hydrocarbons instead of the broader term hydrocarbons if the document discusses only aromatic hydrocarbons.

In the printed CA Chemical Substance Index, at the very beginning of the entries for a given compound, you will often encounter abstract numbers immediately following the compound name and CAS Registry Number. There is no accompanying text to indicate what information is contained in the document. This CAS convention implies that the preparation of the substance is the main thrust of the document. Thus, in the 1987-1991 Chemical Substance Index is found:

-----------------------------------------------
| 1H-Indole-2,3-dione (isatin) [91-56-5]
| 112: 7313u, 198038r
| absorption spectra of, in relation to dimethylindigo,
| 115: 258278t
| etc.
-----------------------------------------------

The subject of the first two items is apparent from the titles:

112: 7313u An improved synthesis of isatin. Alam, M.; Younas, M.; Zafar, M. A.; Naeem (Dep. Pharm., Univ. Punjab, Lahore, India). Pak. J. Sci. Ind. Res. 1989, 32(4), 246 (Eng).

112: 198038r Convenient preparation of 3,3-dibromo-1,3-dihydroindol-2-ones and indole-2,3-diones (isatins) from indoles. Parrick, John; Yahya, Arbaeyah; Ijaz, Abdul S.; Yizun, Jin (Chem. Dep., Brunel Univ., Uxbridge/Middlesex, UK UB8 3PH). J. Chem. Soc., Perkin Trans. 1 1989, (11), 2009-15 (Eng).

Functional derivatives follow the entry for the parent compound (the Heading Parent) in the Chemical Substance Index.
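
The abstract references in the index entry above have a regular shape: a volume number, a colon, and one or more abstract numbers that each end in a letter. As a minimal sketch, such a reference line can be split into its parts in Python. The function name and the regular expressions are illustrative assumptions and are not part of any CAS tool.

```python
import re

# Hypothetical helper: split a printed CA index reference line such as
# "112: 7313u, 198038r" into (volume, abstract number, trailing letter).
REF_LINE = re.compile(r"^\s*(\d+)\s*:\s*(.+)$")
ABSTRACT = re.compile(r"^(\d+)([a-z])$")

def parse_ca_references(line):
    match = REF_LINE.match(line)
    if match is None:
        raise ValueError(f"not a CA reference line: {line!r}")
    volume = int(match.group(1))
    refs = []
    for item in match.group(2).split(","):
        item = item.strip()
        abstract = ABSTRACT.match(item)
        if abstract is None:
            raise ValueError(f"not an abstract number: {item!r}")
        refs.append((volume, int(abstract.group(1)), abstract.group(2)))
    return refs

print(parse_ca_references("112: 7313u, 198038r"))
# [(112, 7313, 'u'), (112, 198038, 'r')]
```

A reference line that appears directly under the compound name, with no modifying text, would then be a candidate for a preparation paper under the convention described above.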

VIII. CAS ROLES.

CAS Roles are indexing terms in the CA/CAplus databases. Roles originally were provided for database entries from midway in the v. 121 indexing (from October 1994). They were later applied via a computer algorithm to all database entries beginning with 1967. There are seven super roles and 38 more specific words used to describe the information that deals with the substances indexed in a document. One of the super roles is PREP (Preparation), which has more specific roles:

IX. Reviews of Reaction or Synthetic Chemistry.

Theilheimer's Synthetic Methods of Organic Chemistry begins each annual volume with a review article "Trends in Synthetic Organic Chemistry." The database Synthesis Reviews began to appear in 1995 as a supplement to the journal Synthesis. With the 1996 update, the database covers over 7500 articles and books and is supplied in EndNote Plus format. The annual survey Organic Reactions (1942-) covers well-defined reactions, such as name reactions. The latest volume has a cumulative list of all chapters in earlier volumes, and there are cumulative author and chapter/topic indexes that cover all volumes published.

A new journal from the Royal Society of Chemistry is Contemporary Organic Synthesis (1994-). The Royal Society also publishes two alerting services, Natural Product Updates (1987-) for bio-organic chemistry, and Methods in Organic Synthesis (1984-). In 2001, the RSC made the two products available to subscribers on the Web, with references from the year 2000 onward. See NPU and MOS.

X. Name Reactions.

The Fischer indole synthesis, the Claisen rearrangement, the Wolff-Kishner reduction, the Beckmann rearrangement, and the Friedel-Crafts reaction: a person who is well versed in synthetic chemistry techniques may know all of these reactions. However, there are hundreds of others that bear the names of their discoverers, and sometimes a reference to them can be perplexing. To help in this regard, compilations of so-called name reactions have been published over the years. The most recent to appear is Hassner and Stumer's Organic Syntheses Based on Name Reactions (2nd ed., 2002). Another is Mundy and Ellerd's Name Reactions and Reagents in Organic Synthesis (1988).

 


 

Part 17: Chemical Safety and Toxicology Information


I. Introduction

All too often we see news stories of chemical industry practices that have had negative effects on health or the environment or hear reports of serious accidents or spills involving chemicals. An item in Chemical & Engineering News (December 8, 1997, p. 17) reported on "Hanford tanks leaking to groundwater." Groundwater was being contaminated with liquid wastes that had leaked from the tanks at the former nuclear weapons plant in Richland, Washington. The public perception of chemistry is tarnished by such stories, so chemists have a responsibility to use the safest possible practices in handling chemical substances and disposing of them. The American Chemical Society's 1994 document "The Chemist's Code of Conduct" contains these statements:

"Chemists should actively be concerned with the health and welfare of co-workers, consumers, and the community."

"Chemists should understand and anticipate the environmental consequences of their work. Chemists have responsibility to avoid pollution and to protect the environment."

In this section, we will encounter printed and computer-based sources to help keep abreast of the hazards associated with chemical substances and to become aware of the rules and regulations that govern the use of chemicals.

A large number of acronyms are found in the health and safety area, for example, TLV (Threshold Limit Value). There are also quite a few numbers, in addition to Chemical Abstracts Registry Numbers, that are used to identify chemical substances that have been studied for their environmental or health and safety impact. A single chemical substance may have a DOT (Department of Transportation) number, an RTECS number (from the Registry of Toxic Effects of Chemical Substances), and several others assigned by various US, European, International, or other agencies.

Furthermore, new types of chemical data, not likely to have been seen before, will be encountered for chemical substances in the reference tools or databases discussed in this section. These include things such as the octanol/water partition coefficient (Kow), soil organic carbon partition coefficient (Koc), measures of nitrification inhibition, acute toxicity data, and others.
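
The octanol/water partition coefficient is usually reported as its base-10 logarithm (log Kow). As a rough illustration, assuming Kow is taken as the ratio of a substance's equilibrium concentration in the octanol phase to its concentration in the water phase, the calculation can be sketched in Python. The function name and the sample concentrations are hypothetical and are not measured data for any compound.

```python
import math

def log_kow(conc_octanol, conc_water):
    """Return log10 of the octanol/water partition coefficient.

    Both concentrations must be in the same units and refer to the
    two phases at equilibrium.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)

# Hypothetical concentrations (same units), for illustration only.
print(round(log_kow(5.0, 0.5), 2))  # 1.0
```

A positive log Kow indicates a substance that partitions preferentially into the octanol phase; a negative value indicates preference for water.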

II. General Safety or Toxicology Information Sources

A good printed subject guide to environmentally-related topics is the Encyclopedia of Environmental Information Sources (1993). For over 800 subject areas, ranging from hazardous materials to alternative energy, the encyclopedia lists the major sources of information. Identified are specialized abstracting and indexing services, bibliographies, directories, encyclopedias and dictionaries, handbooks and manuals, online databases, and relevant organizations.

Mirroring the scope and depth of coverage found in the comprehensive treatises published by Pergamon in inorganic, organic, organometallic and other areas of chemistry is the 13-volume Comprehensive Toxicology, which appeared in 1997. The work covers toxicology from the molecular to the organismal level, including a review of the general principles of toxicology, test procedures and data evaluation, and biotransformation of chemicals. The bulk of the volumes, however, are devoted to the toxicology of specific organ systems. Volume 12 treats current concepts of carcinogenesis.

Works encountered in other contexts, such as the Merck Index or the Kirk-Othmer Encyclopedia of Chemical Technology, include quite a bit of information on the hazardous or safety aspects of the chemicals discussed. However, there are a large number of specialized reference tools whose primary aim is to make the retrieval of such information very easy. Some of those are introduced below.

III. Hazardous Aspects or Toxicology of Chemicals

Sax's Dangerous Properties of Industrial Materials (3 v., 10th ed., 1999) covers over 23,500 toxic, carcinogenic, mutagenic, highly flammable, or potentially explosive substances. Included are health-related and physical property data. There are many pages of synonyms in several languages to assist in using the work. It is also indexed by CAS Registry Number. A CD-ROM version is also available.

Bretherick's Handbook of Reactive Chemical Hazards (5th ed., 1995) covers some 5,000 elements and compounds. Now published in two volumes, the first is devoted to specific information on the stability of the compounds or the reactivity of mixtures of two or more of them under various conditions. In volume 2 are found groups of chemical substances arranged on the basis of similarities in structure or reactivity. Stability data on single specific compounds, data on possible violent interaction between two or more compounds, general data on a class or group of compounds (or information on the identity of individual compounds in a known hazardous group), structures associated with explosive instability, and fire-related data are all included in the work. Information on how to use the handbook includes the important caveat "Do not assume that lack of information means that no hazard exists." Over 2,000 pages of information make this the work to consult first for hazardous reaction information. A CD-ROM version is also available, and it can be found on ChemWeb.com http://www.chemweb.com.

Patty's Industrial Hygiene and Patty's Toxicology, now in the 5th edition, collectively cover General Principles, Toxicology, and Theory and Rationale. The focus of the work in recent editions has been extended beyond the industrial workplace to environmental safety and hazard control. It is now also available online.

The Dictionary of Substances and their Effects (DOSE) covers over 4,100 chemicals that have been studied for environmental impact or toxicity. The second edition of the print product appeared in 7 volumes in 1999. All purchasers of the print edition of DOSE receive free site-wide access to a fully searchable Web database. DOSE includes results of recent carcinogenicity, mutagenicity, and environmental fate studies, as well as the latest regulatory requirements.

IV. Material Safety Data Sheets and Other Factual Sources

MSDSs, or Material Safety Data Sheets, are available from the manufacturers of chemical substances. It is to the manufacturer's advantage to ensure that all known hazardous aspects and recommended precautions are clearly laid out to the user of their products, and that is the primary purpose of the MSDS. A very useful guide to MSDSs on the Internet includes sample MSDSs and sources of MSDSs, both on the Internet and elsewhere.

In 1983, the US Occupational Safety and Health Administration (OSHA) published the Hazard Communication Standard, requiring chemical manufacturers and distributors to provide MSDSs to their customers beginning in late 1985. Since 1993, chemical manufacturers in the US have followed a voluntary MSDS format that is endorsed by the American National Standards Institute (ANSI). The most important information appears at the top, followed by the chemical name, manufacturer, and composition. The third section identifies known hazards associated with the substance. First-aid measures are next, followed by fire-fighting measures.

The most critical parts of an MSDS are the human health hazards and acceptable exposure limits. However, data from human testing are rare, so MSDSs generally rely on animal test data. The foreign equivalents of the US MSDSs are much shorter. Those are called International Chemical Safety Cards (ICSCs) and are published by the World Health Organization and the European Union.

Such documents as MSDSs and ICSCs are really geared toward the larger quantities of chemicals used in industry. For academic institutions, although MSDSs are still required on site, a book such as the US National Research Council's Prudent Practices in the Laboratory: Handling and Disposal of Chemicals contains much practical information. The book includes Laboratory Chemical Safety Summaries (LCSSs) that provide the same type of information as do MSDSs, but are geared for the laboratory user. Also found in the work are guidance on risk assessment and tips on how to work with laboratory equipment.

The Environmental Science Center of Syracuse Research Corporation makes available a searchable Physical Properties Database (PHYSPROP) that covers over 25,000 substances. An example of the output is below:

Kow of Isatin

V. The National Library of Medicine's TOXNET System

NLM's TOXNET (Toxicology Data Service) is a free service with access to many toxicology databases that formerly cost money to search. Included are:

The Hazardous Substances Data Bank (HSDB) contains over 4,500 chemical records, each of which can have as many as 150 or so fields of data, covering human health effects, emergency medical treatment, animal toxicity studies, metabolism/pharmacokinetics, pharmacology, environmental fate and exposure, environmental standards and regulations, chemical/physical properties, chemical safety and handling, occupational exposure standards and more. HSDB is peer-reviewed by a committee of experts, the Scientific Review Panel (SRP).

VI. The Canadian Centre for Occupational Health and Safety (CCOHS) Databases

Among the numerous reasonably priced databases offered by CCOHS is RTECS, the Registry of Toxic Effects of Chemical Substances. RTECS was originally produced by the US National Institute for Occupational Safety and Health (NIOSH), but has now been outsourced to MDL. RTECS provides toxicological information with citations on over 140,000 chemical substances. Included are toxicological data and reviews, international workplace exposure limits, references to US standards and regulations, analytical methods, and exposure and hazard survey data. RTECS contains:

CCOHS also maintains an extensive Material Safety Data Sheet collection and other databases covering aspects of occupational safety and health that range to ergonomics/workplace design and psychological aspects of a safe workplace environment.

VII. MDL and CAS Databases

Incorporated into the results of substance searches on SciFinder and SciFinder Scholar is a link to Chemical Abstracts Service's CHEMLIST database, a source of regulatory information for over 220,000 chemical substances covering 1979 to the present. From the STN Database Summary Sheet of 10/03:

The CHEMLIST File (Regulated Chemicals Listing) contains chemical substances on national inventories, such as the Toxic Substances Control Act (TSCA) Inventory, the European Inventory of Existing Commercial Chemical Substances (EINECS) as well as new notifications on the European List of Notified Chemical Substances (ELINCS), the No-Longer Polymers List that was prepared by the EEC in accordance with the 7th Amendment of Directive 92/32/EEC, and the Canadian Domestic Substances and Non-Domestic Substances Lists (DSL/NDSL), the Australian Inventory of Chemical Substances (AICS), and the Philippines Inventory of Chemicals and Chemical Substances (PICCS). The Korean Existing Chemicals List (ECL) is included with chemical names and designations of Toxic Chemicals and/or Specified Toxic Chemicals. CHEMLIST also contains the list of Existing and New Chemical Substances (ENCS) promulgated by the Kashin Act of Japan, which regulates chemical substances that are either manufactured or imported in Japan. The file includes two lists of chemicals regulated in Switzerland, Giftliste 1 (List of Toxic Substances 1) and the INVENTORY of Notified New Substances in Accordance with the Ordinance on Substances. CHEMLIST contains the list of Toxic Chemical Substances that is regulated under the Taiwan Toxic Chemical Substances Control Act of 1986. The 2001 proposed list of chemical substances that is to be regulated under the Israel Hazardous Substances Law and Regulations List is included. CHEMLIST includes substances subject to regulation under Title III of the Superfund Amendments and Reauthorization Act (SARA, Sections 110 and 313), the Resource Conservation and Recovery Act (RCRA), as well as U.S. regulatory lists, such as the Occupational Safety and Health Administration (OSHA) Highly Hazardous Chemicals List, lists of the U.S. Department of Transportation, and some U.S. state lists. High Production Volume Chemical Lists from Australia, International Council of Chemical Associations (ICCA), Organization for Economic Co-operation and Development (OECD) and the United States, and the German Water Hazard Class Substance List are included.

The records contain substance identity information, inventory status, source of information, and summaries of regulatory activity, reports, and other compliance information.

CAS has also created a multidatabase product called TOXCENTER (Toxicology Center). It is a bibliographic database that covers the pharmacological, biochemical, physiological, and toxicological effects of drugs and other chemicals.

TOXCENTER is composed of the data from 18 other STN files, including:

The records in the file contain bibliographic data, abstracts, indexing terms, chemical names, and CAS Registry Numbers.

MDL incorporates in the Beilstein database a field for pharmacological data that includes much relevant information. For example, the isatin record (BRN 383659) has information on MAO-inhibiting activity, acute toxicity, and other biological activity of the substance. Bear in mind that Beilstein should not be considered a comprehensive source of such data, since this information began to be added to the Beilstein database only in the 1980s.

The EcoPharm module is an add-on to CrossFire. EcoPharm's pharmacological and toxicological data focus on:

VIII. The CIS: Chemical Information System

One of the first integrated, structure-searchable systems, this database had its roots in the old Chemical Information System, a joint project of the National Institutes of Health and the Environmental Protection Agency. The current database is a for-fee product, available from NISC, the National Information Services Corporation. The present CIS includes databases in the following areas:

Hazardous materials are classified as toxic, corrosive, ignitable, or reactive. They can be liquid, solid, or gas. 

IX. US Environmental Protection Agency and Other US Government Sources

The EPA maintains a Substance Registry System to assist in locating chemical and biological substances whose properties make them of concern to the EPA. Chemicals are identified by a Chemical Abstracts Service Registry Number (CASRN) or if not available, an EPA Chemical Identifier (EPA ID), a systematic name (generally the CAS 9th Collective Index Name), a molecular formula, a molecular weight, former CASRN references, synonyms, and information about regulations, EPA data systems, and other sources that list the chemical.
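
A CASRN carries its own check digit, which makes it possible to catch transcription errors before searching a registry. The final digit equals the sum of the preceding digits, each multiplied by its position counted from the right (1, 2, 3, ...), taken modulo 10. A minimal Python sketch follows; the function name is an illustrative assumption and is not part of any EPA or CAS software.

```python
import re

def is_valid_casrn(casrn):
    """Check the format and check digit of a CAS Registry Number string."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", casrn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)   # all digits except the check digit
    check = int(match.group(3))
    # Weight each digit by its position counted from the right: 1, 2, 3, ...
    total = sum(int(digit) * weight
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check

print(is_valid_casrn("91-56-5"))  # isatin: True
print(is_valid_casrn("91-56-6"))  # wrong check digit: False
```

For isatin, 91-56-5, the weighted sum is 6×1 + 5×2 + 1×3 + 9×4 = 55, and 55 modulo 10 gives the check digit 5.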

Other resources at EPA include:

Environmental Terms and Definitions contains collections of environmental terms and definitions from a variety of sources, and can be searched by keyword, information resource, and organization.

The gateway to most US federal government sources now is FirstGov. Environment, Energy, and Agriculture is one of the main categories that citizens can choose at this site.

X. Other Sources

The Royal Society of Chemistry has produced since 1981 a bibliographic database called the Chemical Safety NewsBase (CSNB). It covers health and safety hazards of chemicals in relevant industries. There are many other sources in which to find chemical safety, toxicological, or environmental information. Examples are specialized abstracting services such as Water Resources Abstracts, publications from organizations such as the National Fire Protection Association or the American Chemical Society, and the many sources available from governmental agencies such as the Centers for Disease Control or the Environmental Protection Agency, not to mention a wealth of publications in these areas by commercial publishers, such as Lewis' Dictionary of Toxicology.


 

Miscellaneous

Part 18: Chemical History, Biography, Directories, and Industry Sources


I. Introduction

How do you find an address of a known chemist or chemical manufacturer? Who can supply a chemical in a needed quantity? How can you improve your chances of finding a job in chemistry? Answers to these and related questions can be found in the sources discussed below.

II. Historical Information

Sarton's A Guide to the History of Science, published in 1952, is the standard printed work in the field. On the Web is Doug Stewart's History of Science/Science Studies Reference Sources bibliography. The second edition of Milestones in Science and Technology (1994) is subtitled "The Ready Reference Guide to Discoveries, Inventions, and Facts."

For chemistry, Sturchio's The History of Chemistry: A Critical Bibliography (1985) provides excellent coverage, but more recent (1994) is the Bibliography on the History of Chemistry and Chemical Technology, 17th to the 19th Century, edited by Valentin Wehefritz. The 1998 revision of the Chemical Heritage Foundation's Introducing the Chemical Sciences is an introductory guide designed particularly for teachers and their students. The German-language work Chronologie Chemie by Sieghard Neufeldt is a timeline of the major events that have shaped chemical science. The most important writings in chemistry are summarized by year, and there is also much valuable information on topics such as the development of chemical nomenclature and older scientific periodicals.

A number of books on the history of chemistry have appeared in the past decade:

·         From Chemical Philosophy to Theoretical Chemistry (1993)

·         The Historical Development of Chemical Concepts (1991)

·         The History of Chemistry (1992)

·         Ideas in Chemistry (1992)

·         Murder, Magic, and Medicine (1994)

·         The Norton History of Chemistry (1993) same as:

o        The Fontana History of Chemistry (1992)

A relatively new journal in the field is Foundations of Chemistry (Philosophical, Historical, Educational and Interdisciplinary Studies of Chemistry).

For historical material on chemistry, a unique resource is the Royal Society of Chemistry's Library and Information Centre. The LIC has over 3,000 historical chemical books from the 16th-19th centuries and over 7,000 images of distinguished chemists.

III. Biographical Information

Biographical Sources: A Guide to Dictionaries and Reference Works (1986) is a good place to start for sources of information on famous chemists and other scientists. Such sources contain things like birth and death dates, details of education, honors, positions held, and sometimes even family details.

The most important English-language compilation for scientists of all ages is the Dictionary of Scientific Biography, which has about 5000 biographies for scientists who are no longer living. The 24-volume American National Biography appeared in 1999. It includes biographies of more than 17,500 men and women. The standard source for biographical information on living scientists in the U.S. and Canada is American Men and Women of Science. The work has been revised frequently since the original edition was published in 1906, and the current edition always lists only living scientists. Hence it is important for libraries to retain all editions of the work. Of assistance in finding entries in previous editions of AM & WS is American Men and Women of Science Editions 1-14 Cumulative Index. AM & WS is also available as a database.

The Marquis Who's Who series of publications has long been a standard source for brief biographical information. Among their more specialized publications is Who's Who in Science and Engineering.

The Nobel Prize winners can be found on the Internet in the Chemistry Section of the Nobel Prize Internet site. A printed work with similar information is Nobel Laureates in Chemistry, 1901-1992. It includes biographies, photos, and references to the laureates' most significant publications, as well as their family backgrounds.

Women in Chemistry and Physics: A Biobibliographic Sourcebook covers 75 historical and contemporary women scientists, ranging in birth from 370 AD (the first noted woman mathematician, Hypatia) to 1941 AD (astrophysicist Beatrice Muriel Hill Tinsley). Included are three women who served as presidents of the American Chemical Society: Helen Murray Free, Mary Lowe Good, and Anna Jane Harrison. American Chemists and Chemical Engineers first appeared in 1976, with a second volume published in 1994.

IV. Directories of Scientists and Scientific Organizations

There is a Directory of Technical and Scientific Directories (1988). Although many Web sites provide access to a relevant organization or person, there is no single, all-encompassing world-wide directory or search engine that will allow us to find the needed address in all cases. Nevertheless, a search of Google, AltaVista, or one of the other popular search engines nowadays will often turn up a home page for a scientist or organization.

In print, the ACS Directory of Graduate Research is a frequently-published, reliable source of information on research universities in the US and Canada. It is on the Web as DGRWeb. A version of the 1993 and 1995 DGRs appeared on a CD-ROM entitled ACS Directories on Disk. It includes several other products: College Chemistry Faculties, the Chemical Sciences Graduate School Finder, and Chemical Research Faculties: An International Directory. The ACS Directory of Graduate Research is published every two years. The work is invaluable in assessing the type of research carried out at the major universities since it includes a bibliography of the recent publications of the faculty.

For organizations, the annual Directory of American Research and Technology is a key source. It is especially valuable for those seeking employment in a given area of the country since it includes a geographic index.

V. Directories of Suppliers of Chemicals and Chemical Laboratory or Plant Equipment

Again, the Internet has considerably facilitated the task of finding a chemical supplier, with companies such as Fisher Scientific and Sigma-Aldrich having their catalogs on the Web now. The series of directories with the title Chem Sources is an excellent printed source for U.S. and other suppliers. Chem Sources is a database, CSCHEM, on the STN system, and now has a Web presence.

Also on STN is perhaps the most nearly comprehensive collection of chemical catalogs. It is CHEMCATS, a database of catalogs from worldwide suppliers for commercially available chemicals, enzymes, proteins, and other biochemical substances. Included are catalogs from both manufacturers and distributors of chemicals, including such firms as Sigma, Fluka, and Aldrich's Rare Chemicals catalog, plus Ishihara, Maybridge and many others. In command mode, on the STN System, the command HELP SPP lists all of the suppliers included in CHEMCATS. The combination of a Registry File structure (or other search) plus an LC (locator field) for the presence of CHEMCATS will identify whether a chemical substance is available commercially. The PRICE display format facilitates comparison of the records from the various companies. CHEMCATS is also a featured component of the SciFinder and SciFinder Scholar products from CAS, with direct links from the chemical record to the suppliers' listings.

Companies with access to Molecular Design Limited's software are likely to have an enhanced version of the Fine Chemicals Directory, now called the Available Chemicals Directory, since it has added data on compounds available in bulk (over 25 kilograms).

There are many other directories of chemical manufacturers or suppliers, some with information on custom chemical manufacturing services. Among those are the ACS's Chemcyclopedia (now with a Web version) and the Synthetic Organic Chemical Manufacturers Association SOCMA Commercial Guide. Expensive services can be had from SRI (their Directory of Chemical Producers on Dialog) and Chemical Information Services (their Database and Directory of World Chemical Producers).

For equipment, a revered source found in most libraries is the Thomas Register of American Manufacturers. Covering approximately 150,000 companies, the Thomas Register is available online, on the Web, and as a CD-ROM product. Searches can be conducted by company name, product, Standard Industrial Classification (SIC) code, trade or brand name, geographical location, etc.

Not to be overlooked are the directories published as supplements to chemical or scientific news journals, such as Chemical Week Buyer's Guide issue, the Analytical Chemistry Lab Guide issue (now with a Web version), Nature's Directory of Biologicals issue, etc.

VI. Information on Chemical Industries and Businesses

The Royal Society of Chemistry's Chemical Business NewsBase was sold in 2001 to Engineering Information, an Elsevier Science company. The CBNB was launched in 1985 to provide worldwide chemical business news and information. Chemical Abstracts Service has since 1974 produced CIN (Chemical Industry Notes). Chemical business news in the areas of production, pricing, sales, facilities, products and processes, corporate activities, government activities and the people who work in those areas can be found in CIN. Covering more than 80 journals, trade magazines, newspapers, newsletters, government publications, and special reports, CIN is updated weekly. On STN, the database includes CAS Registry Numbers and a thesaurus for geographic terms.

Books on industrial chemistry include:

·         The Chemical Industry (1993)

·         Industrial Organic Chemistry (1993)

·         Introduction to Industrial Chemistry (1991)

·         Survey of Industrial Chemistry (1992)

The Kline Guide to the U.S. Chemical Industry is an expensive source that is available in print and on CD-ROM. Published since 1971, the guide includes marketing, economic, and company information. Its focus is on the end-use markets for chemical categories, which include many specialty applications and formulated products.

SRI's Chemical Economics Handbook can be found in most business libraries with an interest in chemistry. SRI International (formerly the Stanford Research Institute) is one of the world's leading chemical marketing research services. The source is worldwide in its coverage. It emphasizes commodities and goes into detail on the economics of producing a particular chemical commodity.

The American Chemistry Council annually publishes the Guide to the Business of Chemistry. It contains a 10-year historical overview, with data such as shipments, inventories, volume of output, price indices, and financial performance measures, among others, for basic, specialty, life sciences, and consumer products chemicals.

Chemical news magazines, such as Chemical Week and Chemical & Engineering News, frequently have special issues devoted to analyses of the industry as a whole or of certain subareas of the chemical industry, such as coatings, polymers, etc.

 

 


Part 19: Teaching and Study of Chemistry; Careers in Chemistry


I. Introduction

This lesson will lead you to materials and sources that will assist in either the teaching or the study of chemistry and, ultimately, in finding a job in this field.

II. Teaching of Chemistry

Few books are available to teach you how to teach chemistry, particularly at the post-secondary level. Attempting to fill that gap is J. Dudley Herron's The Chemistry Classroom: Formulas for Successful Teaching (1996). More general works are Teaching Science: A Guide for College and Professional School Instructors (1991) and A Handbook for Teachers in Universities and Colleges: A Guide to Improving Teaching Methods (1995). More traditional approaches are found in such journals as the Journal of Chemical Education, the Journal of College Science Teaching, and The Crucible. The Bibliography of Chemical Education Journals and newsletters from relevant professional groups, such as CHED (the newsletter of the ACS Division of Chemical Education), can also be of assistance. The JCE Index Online can be searched for author names and titles from 1924 onward, but a complete list of keyword index terms has been supplied only for articles published since mid-1995. The Journal of Chemical Education's laboratory experiments are now easily accessible through the Project CHEMLAB database. Several printed sources of demonstrations are available, for example:

·         Chemical Demonstrations: A Sourcebook for Teachers (2 v., 1988)

·         Chemical Demonstrations: A Handbook for Teachers (4 v., 1983-92)

·         Tested Demonstrations in Chemistry (2 v., 1994).

For physical chemistry, you may want to consult Physical Chemistry: Developing a Dynamic Curriculum (1993).

A database for the broader field of education is ERIC, which has extensive coverage of relevant journal articles as well as research reports from 1966 onward.

At the college level, the ACS's Committee on Professional Training issues guidelines for certification of programs of chemistry instruction. Those can be found on the Web as: "Undergraduate Professional Education in Chemistry: Guidelines and Evaluation Procedures."

III. The Study of Chemistry

The ACS Directory of Graduate Research (DGRWeb) can be a great help in selecting a graduate school in the US or Canada. Issued every two years by the American Chemical Society Committee on Professional Training (CPT), it covers the main disciplines of chemistry, including biochemistry, medicinal chemistry, and chemical engineering.

The CPT has a number of publications on the Web, such as "Planning for Graduate Work in Chemistry" (6th ed., 1997).

Many colleges subscribe to CollegeSource ONLINE, which offers over 23,000 college and university catalogs. Both US and non-US institutions of higher learning are included. Peterson's is another standard source of information about college or university programs.

IV. Paths to Careers

Around the third week of October each year, Chemical & Engineering News publishes its "Career Planning Resources" section. There you will find a range of sources, some published by ACS, on topics such as salaries, professional societies, sources of information about employers, etc. Among them are many Internet resources (including the ACS's Office of Career Services site) and free brochures, such as "Tips on Resume Preparation" or its "What a Chemist Should Consider" series of publications, including hints such as "What a BS Chemist Should Consider Before Accepting a Job in Industry."

The ACS operates a National Employment Clearing House at each of the ACS regional and national meetings. Career assistance is provided in a variety of other ways, such as resume review, mock interview sessions, and career-related literature and videos. Also an ACS product, JobSpectrum.org is a comprehensive career Web site offering chemistry jobs and career development resources: data on degrees and employment in the chemical labor force; resources for more effective resume writing, interviewing, networking, and negotiating; chemical employment trends; and salary information. The ACS book Employment Guide for Foreign-Born Chemists in the United States is available free from the Department of Career Services. Other books that can assist in career planning, resume writing, and preparing for interviews include:

·         Alternative Careers in Science: Leaving the Ivory Tower (1998)

·         Best Resumes for Scientists and Engineers (1988)

·         Career Management for Scientists and Engineers (2000)

·         Guide to the Chemical Industry: Technology, R & D, Marketing, and Employment. (Ch. 18. "Getting a Job")

·         Job$ in the Drug Industry: A Career Guide for Chemists (2000)

·         Sweaty Palms: The Neglected Art of Being Interviewed (1993).

The weekly news journal Science has a career service called NextWave. The subscription service is devoted to scientific training and career development. It provides global news, profiles of emerging careers, and advice from experts and role models drawn from an international scientific community. Some of the resources of NextWave are available without charge. For example, the JobsNet section provides free access to the last four weeks of job ads in Science, plus links to other Web job sites.