1 Advanced Structure Search. 2 Structure search in BEILSTEIN


1

Advanced Structure Search

2

Structure search in BEILSTEIN

<253 non-H atoms

Search type controls substitution:

EXA, FAM, CSS, SSS

3

G-groups

4

G-groups

• Elements

• Shortcuts

• Variable groups (Ak, Cy, X, etc.)

• Structural fragments (that you draw)

• Other G-groups

5

G-groups

[Structure diagram: two Cl substituents and one variable position]

N, O, Me, X

Ring isolated; bonds all exact

6

G-groups

[Structure diagram: two Cl substituents and one G1 position]

G1 = O, N, Me, X

Ring isolated; bonds all exact

7

Run a sss sample search

=> l1
SAMPLE SEARCH INITIATED 18:20:45
SAMPLE SCREEN SEARCH COMPLETED - 1648 TO ITERATE
60.7% PROCESSED 1000 ITERATIONS 14 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 30525 TO 35395
PROJECTED ANSWERS: 173 TO 749
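The status message above packs several numbers that the later slides keep comparing (substances to iterate, answers found, projected ranges). As a rough sketch, they can be pulled out with a few regular expressions. The helper name `parse_search_status` is hypothetical and not part of STN or STN Express; the message text is copied from the slide.

```python
import re

# Status message as shown on the slide (line breaks restored).
STATUS = """\
=> l1
SAMPLE SEARCH INITIATED 18:20:45
SAMPLE SCREEN SEARCH COMPLETED - 1648 TO ITERATE
60.7% PROCESSED 1000 ITERATIONS 14 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 30525 TO 35395
PROJECTED ANSWERS: 173 TO 749
"""


def parse_search_status(text):
    """Extract the numeric fields of a search status message (hypothetical helper)."""

    def grab(pattern):
        # Return the captured groups as numbers, or None if the line is absent.
        match = re.search(pattern, text)
        if match is None:
            return None
        return tuple(float(g) if "." in g else int(g) for g in match.groups())

    to_iterate = grab(r"SEARCH COMPLETED - (\d+) TO ITERATE")
    processed = grab(r"([\d.]+)% PROCESSED\s+(\d+) ITERATIONS\s+(\d+) ANSWERS")
    return {
        "to_iterate": to_iterate[0] if to_iterate else None,
        "percent_processed": processed[0] if processed else None,
        "iterations": processed[1] if processed else None,
        "answers": processed[2] if processed else None,
        "projected_iterations": grab(r"PROJECTED ITERATIONS:\s*(\d+) TO (\d+)"),
        "projected_answers": grab(r"PROJECTED ANSWERS:\s*(\d+) TO (\d+)"),
        # The message flags a search that stopped at a system limit.
        "complete": "INCOMPLETE SEARCH" not in text,
    }


status = parse_search_status(STATUS)
print(status)
```

Here the sample stopped after 1000 of 1648 candidate substances, which is where the 60.7% figure comes from.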

CSS Search

8

[Structure diagrams of sample answers; atom labels shown: OMe, Cl, O, P, S]

CSS Search

9

Run a css full search

=> l1 css full
FULL SEARCH INITIATED 18:20:22
FULL SCREEN SEARCH COMPLETED - 33296 TO ITERATE
100.0% PROCESSED 33296 ITERATIONS 4 ANSWERS
SEARCH TIME: 00.00.01
L3 4 SEA CSS FUL L1
=> d scan

CSS Search

10

[Structure diagram of an answer; atom labels shown: Cl, Cl, OH; stereo labels S, R]

CSS Search

11

CSS Search

Be careful:

A CSS search can be charged like an SSS search yet return the same results as a Family search.

Look at the following example:

12

CSS Search

=> Uploading C:\Program Files\stnexp\Queries\css.str
chain nodes : 7 8 9
ring nodes : 1 2 3 4 5 6
chain bonds : 4-7 5-9 6-8
ring bonds : 1-2 1-6 2-3 3-4 4-5 5-6
exact bonds : 1-2 1-6 2-3 3-4 4-5 4-7 5-6 5-9 6-8
L1 STRUCTURE UPLOADED

13

CSS Search

=> s l1 full css
L3 4 SEA CSS FUL L1
=> d cost
COST IN U.S. DOLLARS     SINCE FILE ENTRY   TOTAL SESSION
CONNECT CHARGES                      0.37            0.52
NETWORK CHARGES                      0.06            0.12
SEARCH CHARGES                     160.90          160.90
                                  -------         -------
FULL ESTIMATED COST                161.33          161.54

14

CSS Search

=> s l1 full fam
L5 4 SEA FAM FUL L1
=> d cost
COST IN U.S. DOLLARS     SINCE FILE ENTRY   TOTAL SESSION
CONNECT CHARGES                      0.74            0.89
NETWORK CHARGES                      0.12            0.18
SEARCH CHARGES                      64.00           64.00
                                  -------         -------
FULL ESTIMATED COST                 64.86           65.07

=> l3 or l5
L6 4 L3 OR L5
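The point of the two cost displays can be made with a line of arithmetic. The sketch below uses only the search charges and answer counts shown on the slides; the variable names are illustrative.

```python
# Search charges (U.S. dollars) and answer counts taken from the two "d cost" displays.
searches = {
    "CSS": {"search_charge": 160.90, "answers": 4},
    "FAM": {"search_charge": 64.00, "answers": 4},
}

css = searches["CSS"]
fam = searches["FAM"]

# Both searches returned 4 answers, and "L3 OR L5" still has 4 answers,
# so the two answer sets are identical for this query.
extra_cost = round(css["search_charge"] - fam["search_charge"], 2)
cost_ratio = css["search_charge"] / fam["search_charge"]

print(f"CSS cost {extra_cost:.2f} USD more than FAM for the same {fam['answers']} answers")
print(f"CSS/FAM search charge ratio: {cost_ratio:.2f}")
```

For this particular query the family search would have been the cheaper way to reach the same four substances.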

15

• Structural fragments (that you draw)

• Other G-groups

G-groups

16

G-groups

17

Search Question: Locate analogs of the following substances to be used as possible synthetic intermediates

[Structure diagram: N-containing ring system with CH3 and R' substituents]

R' = C(=O)R'', CN

R'' = NHCH3, OCH2Ph, Ak

G-groups

18

Challenges

R’ represents several classes of compounds

The carboxy derivatives are further defined

Solution

Create a G-group

Embed a G-group within a G-group

G-groups

19

• Draw the fragment(s)

• Label them as fragments: assign the @ to the points of attachment for each fragment

• Save the G-group

G-groups

20

21

22

23

24

25

26

27

28

29

30

31

=> Uploading "advstr1.str" in the current file
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR

[Structure diagram: N-Me ring system bearing G2, plus four drawn fragments with attachment points @1 to @4: NHMe, OCH2Ph, Ak, and C(=O)-G1]

G1 [@1], [@2], [@3]
G2 CN, [@4]
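To see what the embedded G-group covers, the definitions can be expanded by hand or with a small recursive substitution. This is only a sketch: it assumes, from the search question, that G1 holds the three R'' options and that the fourth fragment is the carbonyl carrying G1. The string notation and function names are made up for illustration and are not STN syntax.

```python
import re

# Assumed reading of the slide: G1 = R'' options, G2 = R' options,
# with G1 embedded in the carbonyl fragment of G2.
g_groups = {
    "G1": ["NHCH3", "OCH2Ph", "Ak"],
    "G2": ["CN", "C(=O)-{G1}"],
}


def expand_option(option, groups):
    """Replace the first embedded {Gn} placeholder with each of its options, recursively."""
    match = re.search(r"\{(G\d+)\}", option)
    if match is None:
        return [option]
    expanded = []
    for sub in groups[match.group(1)]:
        filled = option[:match.start()] + sub + option[match.end():]
        expanded.extend(expand_option(filled, groups))
    return expanded


def expand_group(name, groups):
    """List every concrete substituent a G-group stands for."""
    result = []
    for option in groups[name]:
        result.extend(expand_option(option, groups))
    return result


substituents = expand_group("G2", g_groups)
for s in substituents:
    print(s)
```

One nested definition therefore stands for four substituent classes (the cyano group and three carboxy derivatives) in a single query.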

G-groups

32

=> S L1 SS SAM
SAMPLE SEARCH INITIATED
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
L2 50 SEA SSS SAM L1

=> D SCAN
L2 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN Ethanone, 1-(5-chloro-1-methyl-1H-indol-3-yl)-
MF C11 H10 Cl N O

[Structure diagram of this answer; atom labels shown: Ac, Me, Cl, N]

G2 = an acyl analog

G-groups

33

L2 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN 1H-Indole-2-carboxylic acid, 3-cyano-1-methyl-, ethyl ester (9CI)
MF C13 H12 N2 O2

G2 = a cyano analog

[Structure diagram of this answer; atom labels shown: OEt, Me, CN, C, O, N]

G-groups

34

=> S L1 SSS FULL
FULL SEARCH INITIATED 13:43:28 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 12107 TO ITERATE
100.0% PROCESSED 12107 ITERATIONS 797 ANSWERS
SEARCH TIME: 00.00.15

L3 1166 SEA SSS FUL L1

=> FILE CAPLUS

=> S L3/RCT
342 L3
2185962 RCT/RL
L4 171 L3/RCT (L3 (L) RCT/RL)

=> D IBIB ABS HITSTR 38

Role indexing to limit retrieval
RCT = reactant

Which substance from L3 is indexed to this record?

HITSTR provides hit CAS RN, index name and structure

G-groups

35

• Drawing specific structures
– Fragments with 2 points of attachment
– Variable points of attachment on multiple rings

G-groups

36

G-groups

37

G-groups

1. Draw all the fragments.

2. Label the points of attachment with @.

3. Create separate G-groups.

4. “Orient” fragments with 2 points of attachment during the SAVE operation.

38

G-groups

The quinoline ring is drawn twice to account for 2 points of attachment.

• All fragments in G1 (Y) are given 2 points of attachment.
• The amide fragment in Y is drawn twice, to account for both orientations.
• Carbons are left open for substitution.

39

G-groups

1. In the Define New G-Group dialog box, click Fragments.
2. Use Next Fragment to navigate through the fragments in the structure, including the desired fragments.

40

G-groups

The Ak variable was chosen from the Variables menu.

Fragments with 2 points of attachment show up as [*1-*2].

41

G-groups

Orientation of a fragment takes place during the SAVE operation. It cannot be bypassed!

1. Two nodes are highlighted.

2. Show Fragment is used to see a fragment and select the node attachment to G2.

42

G-groups

3. Select the appropriate node for each fragment.
• In the case of the amide (an unsymmetrical fragment), the node is chosen carefully so that the selected node is oriented to the highlighted node in the structure window (G2).

43

G-groups

Each end of each fragment is highlighted during query verification. This shows the orientation of the fragment to the rest of the structure.

44

Search Question: Locate benzyl-substituted N-containing ring systems described by compound AA

[Structure diagram AA: N-containing ring with ring member R and a CH2Ph (benzyl) substituent]

R = N, N-C, NULL

G-groups

45

Challenges

R describes 3 ring systems

How to describe “NULL”

Solution

Create a G-group, with fragments, embedded in the ring

• Start with a five-membered ring

• Use a [0-1] repeating group on G in a six-membered ring

G-groups

46

• Draw the 5-member ring desired

• Draw fragments to account for other ring sizes

• Label the fragments with two points of @ttachment

• Define the G-group; put it in the ring

• Verify the orientation of the fragments

G-groups

47

Challenges (cont)

The N-C fragments must be oriented to retrieve 1,3 systems

Solution

Use two points of attachment and assign the orientation during the SAVE process

G-groups

48

[Structure diagram: ring containing G1 and N atoms with a CH2Ph substituent; two drawn fragments with attachment points @1-@2 and @3-@4]

G1 C, [@1-@2], [@3-@4]

G-groups

49

G-groups

50

G-groups

51

=> Uploading "advstr2.str" in the current file
L6 STRUCTURE UPLOADED
=> D
L6 HAS NO ANSWERS
L6 STR

[Structure diagram: ring containing G1 and N atoms with a CH2Ph substituent; two drawn fragments with attachment points @1-@2 and @3-@4]

G1 C, [@1-@2], [@3-@4]

G-groups

52

=> S L6 SS SAM
SAMPLE SEARCH INITIATED
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
L7 50 SEA SSS SAM L1
=> D SCAN
L7 50 ANSWERS REGISTRY COPYRIGHT 1999 ACS
IN 2H-1,3-Diazepin-2-one, o o o
MF C36 H36 N4 O4 S

A seven-membered diazepine ring

[Structure diagram of this answer]

G-groups

53

L7 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN 2-Pyrrolidinone, 1-(benzoyloxy)-4-methyl-5,5-bis(phenylmethyl)- (9CI)
MF C26 H25 N O3

A five-membered pyrrolidine ring

[Structure diagram of this answer]

G-groups

54

• Draw the 6-member ring desired

• Draw fragments to account for other ring sizes

• Label the fragments with two points of @ttachment

• Define the G-group; put it in the ring

• Use a [0-1] repeating group on G

• Verify the orientation of the fragments

G-groups

55

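G-groups

[Structure diagram: ring containing N atoms, a CH2Ph substituent, and G1 with a 0-1 repeat; two drawn fragments with attachment points @1-@2 and @3-@4]

G1 [@1-@2], [@3-@4]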

56

G-groups

=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\w8.str
L1 STRUCTURE UPLOADED
=> l1
SAMPLE SEARCH INITIATED 13:07:26
SAMPLE SCREEN SEARCH COMPLETED - 12012 TO ITERATE
L2 20 SEA SSS SAM L1

57

G-groups

[Structure diagram of a sample answer]

58

G-groups

[Structure diagram of a sample answer]

59

G-groups

[Structure diagram of a sample answer]

60

Hydrogen in G-groups

61

G-groups

Ring isolated; bonds all exact

[Structure diagram: ring with two Cl substituents, one G1 position, and explicit H atoms at the remaining positions]

G1 H, Me, OH

62

G-groups

=> l1 sss full
L3 12 SEA SSS FUL L1

[Structure diagram of an answer; atom labels shown: Cl, ClCH2, C, NH, O, C, NH, NH2]

G1 = H, OH, Me ??????

How can you get the desired compounds?

63

G-groups

Ring isolated; bonds all exact

=> l1 css full
L3 7 SEA CSS FUL L1

[Structure diagram: two Cl substituents and one G1 position]

G1 H, Me, OH

64

G-groups

=> l1 css full
L3 7 SEA CSS FUL L1

[Structure diagrams of the answers; atom labels shown: Cl, OH; stereo labels R, S]

65

G-groups

Ring isolated; bonds all exact

=> l1 sss full
L3 739 SEA SSS FUL L1

[Structure diagram: two Cl substituents and one G1 position]

G1 H, Me, OH

66

G-groups

[Structure diagrams of sample answers]

67

G-groups

[Structure diagram: two Cl substituents and one G2 position; three drawn fragments with attachment points 1 to 3: H, Me, OH]

G1G2 [@1],[@2],[@3]

68

G-groups

=> l1
SAMPLE SEARCH INITIATED 13:05:49
SAMPLE SCREEN SEARCH COMPLETED - 1642 TO ITERATE
60.9% PROCESSED 1000 ITERATIONS 4 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 30410 TO 35270
PROJECTED ANSWERS: 4 TO 284
L2 4 SEA SSS SAM L1

69

G-groups

[Structure diagrams of the sample answers]

70

G-groups

[Structure diagram: N-containing ring]

1) Only H or Ak can be attached to the nitrogen
2) The ring is isolated
3) Any substitution is allowed at the open sites

71

G-groups

[Structure diagram: N atom bearing G1 positions; drawn fragment 1 is Ak]

G1 H, [@1]

72

G-groups

=> fil reg
=> c6/ea and nrs=1 and nc=1
L1 1536728 C6/EA AND NRS=1 AND NC=1

=> Uploading C:\Program Files\stnexp\Queries\zanzola.str
L2 STRUCTURE UPLOADED

73

G-groups

=> l2 sample subset=l1
PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
L3 50 SEA SUB=L1 SSS SAM L2
=> d scan

[Structure diagram of a sample answer]

74

G-groups

How can you run the previous search, obtaining good results?

75

G-groups

[Structure diagram: G1 with three drawn N fragments at attachment points 1 to 3, carrying H/Ak, Ak/Ak, and H/H on the nitrogen]

G1 [@1], [@2], [@3]

76

G-groups

=> Uploading C:\Program Files\stnexp\Queries\zanzolabis.str
L4 STRUCTURE UPLOADED

=> l4 sample subset=l1
PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
L5 50 SEA SUB=L1 SSS SAM L4
=> d scan

77

G-groups

[Structure diagrams of sample answers]

78

Repeating Groups

79

To specify a repeating group:

• Highlight all atoms in the repeating group

• Select [ ] m-n from the Draw menu

• Enter the repeat values

• Use an unspecified bond between atoms in the repeating group to allow any bonding in the repetition

Repeating Groups

80

A repeating group may range from 0-20 only; however, it:

• may repeat a single atom or group of atoms

• may have multiple repeating groups

Repeating Groups

81

Repeating Groups

Search Question: Locate studies on carboxylic acids in milk that have the following structures:

[Structure diagram: CH3-R'-C(=O)OH]

Where R’ = an unsubstituted carbon chain of 10-40 atoms with any type of bonding between the atoms

82

Challenges

R = 10-40 atom carbon chain with any type of bonding

Solution

Use [ ] m-n (repeating group)

Repeating Groups

83

A repeating group may range from 0-20 only; however, it:

• may repeat a single atom or group of atoms

--(C)0-1(C--C)5-20---
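Because a single repeating group is capped at 20 repeats, the 10-40 atom chain is built from two groups. A quick enumeration shows which chain lengths the combination reaches. This is a sketch that assumes the notation means an optional single carbon followed by 5 to 20 repeats of a two-carbon unit; the function name is illustrative.

```python
def chain_lengths(groups):
    """Return every total atom count reachable from concatenated repeating groups.

    groups: list of (atoms_per_repeat, min_repeats, max_repeats) tuples.
    """
    totals = {0}
    for atoms, low, high in groups:
        totals = {t + atoms * n for t in totals for n in range(low, high + 1)}
    return sorted(totals)


# (C)0-1 followed by (C--C)5-20, as read from the slide.
lengths = chain_lengths([(1, 0, 1), (2, 5, 20)])
print(lengths[0], "to", lengths[-1], "atoms;", len(lengths), "distinct lengths")
```

Under that reading the two-carbon unit alone gives only even lengths, and the optional single carbon fills in the odd ones. The enumeration also reaches 41 atoms, one more than the 10-40 range in the search question.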

Repeating Groups

84

Repeating Groups

85

=> FILE REGISTRY
=> Uploading acid.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

[Structure diagram: carboxylic acid (O, OH) on a chain with repeating groups 5-20 and 0-1, ending in Me]

Repeating Groups

86

D SCAN
L2 46 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN Chitosan, octadecanoate (salt) (9CI)
MF C18 H36 O2 . x Unspecified
CM 1
*** STRUCTURE DIAGRAM IS NOT AVAILABLE
CM 2

HO2C-(CH2)16-Me

Repeating Groups

87

L2 34 ANSWERS REGISTRY COPYRIGHT 1999 ACS
IN 9-Dodecenoic acid (7CI, 8CI, 9CI)
MF C12 H22 O2

HO2C-(CH2)7-CH=CH-Et

Repeating Groups

88

Blocking Substitution

89

• Specify substitution with G-groups

• Use H

• Use Non-H Attachments (Connectivity)

• Exclude atoms

• CSS - closed substructure search

Blocking Substitution

90

Blocking Substitution

Connectivity

91

Blocking Substitution

Connectivity

92

Blocking Substitution

Connectivity

93

Search Question: Locate substances with the following general structure characteristics

[Structure diagram; atom labels shown: N N, N, R']

• Two Br's

• R' = anything except an additional bromine

• The N substituent may be in a ring or a chain

• No further substitution on this ring; substitution is allowed at all other open positions

Blocking Substitution

94

Challenges

R’ is any atom except Br

The N may be in a ring or chain

Solution

Use exclude Br

Assign Ring/chain Node Characteristics

Blocking Substitution

95

Challenges (cont)

Two Br, with no additional substitution on one ring. Substitution is allowed at other open sites.

Solution

1: Use G1=H/Br on 4 positions and an SSS search

2: Use a Variable Point of Attachment and a CSS search with VPAs and Non-hydrogen Attachments (Connectivity)

Blocking Substitution

96

=> Uploading dye1.str
L1 STRUCTURE UPLOADED

=> D
L1 HAS NO ANSWERS
L1 STR

[Structure diagram; atom labels shown: N, N, Br, N, and four G1 positions]

G1 H, Br

Approach 1: G1=H/Br

Blocking Substitution

97

Blocking Substitution

BUT

98

=> L1 SSS SAM
SAMPLE SEARCH INITIATED
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
L2 50 SEA SSS SAM L1

=> D SCAN
L2 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN 3-Pyrrolidinol, 1-[4-[(4-nitrophenyl)azo]phenyl]-, (R)- (9CI)
MF C16 H16 N4 O3

[Structure diagram of this answer]

All G1 values are H

Blocking Substitution

99

A CSS blocks substitution at open sites, but allows for:

• Variables (Ak, Cb, G-groups)
• VPAs
• Repeating groups
• Variable bonds
• Excluded atoms
• Substitutions set at non-hydrogen atoms (Connectivity)

Blocking Substitution

100

                   SSS    CSS
• G-groups         yes    yes
• variable groups  yes    yes
• VPA              yes    yes
• REP groups       yes    yes
• Exclusions       yes    yes
• variable bonds   yes    yes

• SSS: allows ANY substitution at any open positions

• CSS: NO substitution at open positions (unless set by connectivity)

Blocking Substitution

101

Exclude an atom

• From the Draw menu, choose Atom or Variable
• Select the desired Atom or Variable
• Click EXCLUDE

Blocking Substitution

102

Allow substitution in CSS search

• Highlight the atom of interest (or right-click)

• Click QueryDef then Non-H Attachments

• Set to minimum of 1

Blocking Substitution

103

Specify variable points of attachment to a ring system by adding a VPA:

• Build ring system and variably attached fragment
• Highlight the attachment atom
• With shift key, highlight attachment points
• Select VPA from Draw menu

Blocking Substitution

104

=> FILE REGISTRY
=> Uploading dye2.str
L3 STRUCTURE UPLOADED

=> D L3
L3 HAS NO ANSWERS
L3 STR

Approach 2: CSS, Non-hydrogen attachments and VPA

[Structure diagram; atom labels shown: N, N, Br, Br, Br, N; node annotations: Conn. Min. 2, Conn. Min. 1, R/C, Conn. Min. 2]

Blocking Substitution

105

=> S L3 CSS FULL
FULL SEARCH INITIATED 16:21:29
FULL SCREEN SEARCH COMPLETED - 1303 TO ITERATE
100.0% PROCESSED 1303 ITERATIONS 338 ANSWERS

L4 62 SEA CSS FUL L3

Blocking Substitution

[Structure diagram of an answer]

106

Blocking Substitution

BUT

107

Blocking Substitution

[Structure diagram; atom labels shown: N, N, N, Br, Br, Br, H, G1; drawn fragments at attachment points 1 and 2]

G1 [@1], [@2]

Consider this structure. Same conditions as the previous one.

108

Blocking Substitution

=> Uploading C:\Program Files\stnexp\Queries\gbrexluded.str
L5 STRUCTURE UPLOADED
=> l5 full css
L6 84 SEA CSS FUL L5
=> l6 not l4
L7 22 L6 NOT L4

[Structure diagram of an answer]

109

Exclude an atom

• From the Draw menu, choose Atom or Variable
• Select the desired Atom or Variable
• Click EXCLUDE

Blocking Substitution

BUT if you exclude an atom or variable, you also exclude hydrogen

110

Skills Practice

[Exercise table: a query (node labels O, O, C, C, C; Conn. = E1, Max 2) is compared with candidate substituents in Registry (CN, Cl, CH3, C(=O)Ph, CF3), each marked Yes or No]

111

[Structure diagram: six C atoms and an Ak substituent]

If you want Ak to be unsubstituted, can you use connectivity? If yes, what should the value be?

If you set the connectivity to E=1, do you also get branched chains? If yes, how can you isolate them?

112

[Structure diagram: six C atoms and an Ak substituent]

If you want Ak to be substituted only with a =O, what should you do?

113

Connectivity E=2

[Structure diagram: six C atoms and an Ak substituent bearing O]

114

Alkyl, Cycloalkyl: not substituted

Other substitutions allowed

R1=H, Alkyl

R2= H, Alkyl, Cycloalkyl

A = N, Alkyl

Skills Practice

115

System Limits

116

Structure Search Limits

Scope of Search                   Iterations   Answers
Sample (online, subset, range)    2000         50
Full (online, subset, range)      1,000,000    1,000,000
Batch (online, subset)            1,500,000    1,500,000

System Limits

117

No. of Substances Crossed to Search

                          CAS Files                             Non-CAS Files
REGISTRY file crossover   300,000 (CAPlus), 40,000 (CASREACT)   10,000

=> HELP SLIMIT

=> HELP CROSSOVER

System Limits

118

Strategies when search limits are reached:

Query modification

• Structure drawing techniques

• Screens

STN system-related options

• Batch searching

• Range searching

• Subset searching

System Limits

119

=> S L1 SSS FULL
FULL SEARCH INITIATED 15:30:15 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - >1,000,000 TO ITERATE
15.3% PROCESSED 153323 ITERATIONS 1 ANSWERS
40.0% PROCESSED 400000 ITERATIONS 31 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.01.07

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 419

L2 31 SEA L1 SSS FULL

System Limits

120

Strategies when search limits are reached:

– Query modification

• Structure drawing techniques

System Limits

121

Strategies when search limits are reached:

• Ring isolation

• Changing bond values

• Additional substitution

System Limits

122

Locate references discussing the use of steroidal substances with the following structure as therapeutic agents. Structures must have O-substitution at the specified position. No C-O bond order is specified.

[Structure diagram: steroid skeleton with an O substituent]

System Limits

123

=> FILE REGISTRY
=> Uploading steroid1.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

[Structure diagram: steroid skeleton with an O substituent]

• Unspecified C-O bond

• Unspecified bonds for keto/enol

• Default isolated/embedded ring

• C-O bond is set to ring/chain

System Limits

124

=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 07:35:47 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 40288 TO ITERATE

2.5% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: 793876 TO 817644
PROJECTED ANSWERS: 101993 TO 110727

L2 50 SEA SSS SAM L1

System Limits

125

Ring Isolation

• No additional fused, bridged or spiro-fused attachments

• Improves efficiency of first stage of the search - screening

• Reduces numbers of substances to iterate, and usually reduces number of answers

System Limits

126

=> Uploading stlisol.str
L9 STRUCTURE UPLOADED

=> D L9
L9 HAS NO ANSWERS
L9 STR

[Structure diagram: steroid skeleton with an O substituent]

• Unspecified C-O bond

• Unspecified bonds for keto/enol

• C-O bond is set to ring/chain

• Ring is isolated

System Limits

127

=> S L9 SSS SAM
SAMPLE SEARCH INITIATED 07:51:14 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 17972 TO ITERATE

5.6% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 351444 TO 367436
PROJECTED ANSWERS: 54995 TO 61463

L10 50 SEA SSS SAM L9

System Limits

128

=> S L9 SSS FULL
FULL SEARCH INITIATED 14:12:28 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 363218 TO ITERATE
100.0% PROCESSED 363218 ITERATIONS 58645 ANSWERS
SEARCH TIME: 00.00.11

L11 58645 SEA SSS FUL L9
=> FILE CAPLUS

=> S L11/THU
59530 L2
368975 THU/RL
L12 3082 L11/THU (L11 (L) THU/RL)

System Limits

129

=> D SCAN
L12 3082 ANSWERS CAPLUS COPYRIGHT 2001 ACS
IC ICM A61K-031/575
CC 63-6 (Pharmaceuticals)
TI Eye drop for the treatment of gray cataract
ST eye drop gray cataract bile ext
IT Bile
   Cataract
o o o
IT 57-88-5, Cholesterol, biological studies 81-25-4, Cholic acid 149-91-7, Gallic acid, biological studies 635-65-4, Bilirubin, biological studies 25312-65-6, Cholanic acid
   RL: BAC (Biological activity or effector, except adverse); THU (Therapeutic use); BIOL (Biological study); USES (Uses)

System Limits

130

Changing Bond Values

• Change from unspecified to a specific value

• Improves efficiency of first stage of the search - screening, reducing number of substances to be iterated

• Change ring/chain to chain OR ring also reduces number of substances to be iterated

System Limits

131

=> Uploading stlnode.str
L11 STRUCTURE UPLOADED

=> D L11
L11 HAS NO ANSWERS
L11 STR

[Structure diagram: steroid skeleton with an O substituent]

• Single C-O bond

• Single ring bonds

• C-O bond is set to chain

• Ring is isolated

System Limits

132

=> S L11 SSS SAM
SAMPLE SEARCH INITIATED 07:52:33 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 15539 TO ITERATE

6.4% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 303339 TO 318221
PROJECTED ANSWERS: 44325 TO 50151

L12 50 SEA SSS SAM L11

System Limits

133

Additional Substitution

• Addition of NON-HYDROGEN substituents decreases number of substances to be iterated

• Provides a wider selection of screens

System Limits

134

=> Uploading str1subs.str
L13 STRUCTURE UPLOADED

=> D L13
L13 HAS NO ANSWERS
L13 STR

[Structure diagram: steroid skeleton with two O substituents]

• Single C-O bond

• Single ring bonds

• C-O bond is set to chain

• Ring is isolated

• Another C-O bond

System Limits

135

=> S L13 SSS SAM
SAMPLE SEARCH INITIATED 13:03:35 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 10290 TO ITERATE

9.7% PROCESSED 1000 ITERATIONS 11 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 199734 TO 211866
PROJECTED ANSWERS: 1625 TO 2901

L14 11 SEA SSS SAM L13
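The four sample searches on the steroid query give a direct measure of how much each drawing technique trims the screening stage. The sketch below compares the midpoints of the projected iteration ranges copied from the slides; using the midpoint as the summary figure is my own simplification, and the labels are shorthand for the uploaded query files.

```python
# Projected full-file iterations (low, high) from the four sample searches.
SAMPLE_PROJECTIONS = [
    ("steroid1.str: default query", 793876, 817644),
    ("stlisol.str: ring isolated", 351444, 367436),
    ("stlnode.str: bond values specified", 303339, 318221),
    ("str1subs.str: extra C-O substituent", 199734, 211866),
]


def midpoint(low, high):
    """Midpoint of a projected range, used here as a single summary figure."""
    return (low + high) / 2


baseline = midpoint(*SAMPLE_PROJECTIONS[0][1:])
reductions = {}
for label, low, high in SAMPLE_PROJECTIONS:
    # Percentage drop in projected iterations relative to the default query.
    reductions[label] = 100 * (1 - midpoint(low, high) / baseline)
    print(f"{label:38s} {low:>7}-{high:<7} {reductions[label]:5.1f}% fewer iterations")
```

Ring isolation accounts for most of the saving; setting bond values and adding a non-hydrogen substituent each shave off a further share.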

System Limits

136

• Techniques that DO NOT decrease number of substances iterated:
– Blocking substitution with hydrogen
– Adding stereochemistry
– Adding atom attributes

• These techniques have no effect on number of substances to be iterated, but do typically reduce the number of answers

System Limits

137

Strategies when search limits are reached:

– Query modification

• Screens (structure filters)

System Limits

138

• Add additional “structure filters” or screens to the query

• Adds screens beyond those automatically generated by STN

• Narrows potential answer set

• Added during the structure SAVE process in STN Express

System Limits

139

Locate all structures containing this phenolic structure fragment.

The benzylic C-O bond can be single or double. If single, the O can be part of a ring. The phenyl ring can be part of a larger ring system.

[Structure diagram: phenolic fragment; atom labels shown: OH, O]

System Limits

140

• Enter structure-searchable file, upload query
• Run SAMPLE search, evaluate answers
• Add relevant structure filters
• Upload revised query
• Run SAMPLE search, evaluate answers
• Run a FULL file structure search

System Limits

141

=> FILE REGISTRY
=> Uploading phenol1.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

[Structure diagram: phenolic fragment; atom labels shown: OH, O]

• Benzylic bond is unspecified

• O node is set to ring/chain

• Ring is isolated/embedded

System Limits

142

=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 16:11:32 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 32922 TO ITERATE

3.0% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: 647671 TO 669209
PROJECTED ANSWERS: 140410 TO 150620

L2 50 SEA SSS SAM L1

System Limits

143

Step Action

1 Return to Structure Drawing

2 From FILE menu, click OPEN

3 From FILE menu, click SAVE

4 In SAVING dialog box, check the “Refine Using Structure Filters” checkbox. Click SAVE

System Limits

144

System Limits

145

Step Action

5 From “Refine Using Structure Filters” dialog box select desired options

Structure characteristics for atoms with Hydrogens attached

Number of occurrences

Number of rings

Isotopes

Polymers

System Limits

146

STN Express may suggest structure filters

May select others from this list

Or from this list

System Limits

147

=> ....Testing the current file.... screen
ENTER SCREEN EXPRESSION OR (END):end

=> SCREEN 1700 AND 1943 AND 2005 AND 1838
L3 SCREEN CREATED

=> SCREEN 2043
L4 SCREEN CREATED

=> Uploading C:Filesfilter.str
L5 STRUCTURE UPLOADED

=> QUE L5 AND L3 NOT L4
L6 QUE L5 AND L3 NOT L4

Filters are converted to their “Screen” number

The Query command combines the structure and screen terms
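The screen logic in L6 can be modeled as plain set operations: a substance survives screening only if it carries every screen in L3 and none in L4. This is a toy model under that assumption; the screen numbers come from the slide, but the slide does not say what each screen encodes, and the three candidate substances and their screen sets are invented for illustration.

```python
# Screen numbers from the slide.
REQUIRED_SCREENS = {1700, 1943, 2005, 1838}   # L3: SCREEN 1700 AND 1943 AND 2005 AND 1838
EXCLUDED_SCREENS = {2043}                     # L4: SCREEN 2043


def passes_screens(substance_screens):
    """Model of 'L3 NOT L4': all required screens present, no excluded screen present."""
    has_all_required = REQUIRED_SCREENS <= substance_screens
    has_excluded = bool(EXCLUDED_SCREENS & substance_screens)
    return has_all_required and not has_excluded


# Hypothetical substances with made-up screen sets.
candidates = {
    "substance A": {1700, 1838, 1943, 2005},
    "substance B": {1700, 1838, 1943, 2005, 2043},
    "substance C": {1700, 1943},
}

survivors = [name for name, screens in candidates.items() if passes_screens(screens)]
print(survivors)
```

Only substances that pass this cheap screening step go on to atom-by-atom iteration, which is why added screens lower the "TO ITERATE" count.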

System Limits

148

=> D L6

L6 HAS NO ANSWERS
L3 SCR 1700 AND 1943 AND 2005 AND 1838
L4 SCR 2043
L5 STR

L6 QUE ABB=ON PLU=ON L5 AND L3 NOT L4

[Structure diagram: phenolic fragment; atom labels shown: OH, O]

System Limits

149

=> S L6 SSS SAM
SAMPLE SEARCH INITIATED 16:12:05 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 18039 TO ITERATE

5.5% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 352769 TO 368791
PROJECTED ANSWERS: 131431 TO 141317

L7 50 SEA SSS SAM L5 AND L3 NOT L4

System Limits

150

=> S L6 SSS FULL
FULL SEARCH INITIATED 16:12:18 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 362253 TO ITERATE
100.0% PROCESSED 362253 ITERATIONS 134146 ANSWERS
SEARCH TIME: 00.00.06

L8 134146 SEA SSS FUL L5 AND L3 NOT L4

=> D SCAN
L8 134146 ANSWERS REGISTRY COPYRIGHT 2001 ACS
IN Benzoic acid, 5-chloro-2-hydroxy-, [3-[[4-(dimethylamino)phenyl]amino]-1-methyl-3-oxopropylidene]hydrazide (9CI)
MF C19 H21 Cl N4 O3

[Structure diagram of this answer]

System Limits

151

• Strategies when search limits are reached:
– STN system-related options

• Batch searching

System Limits

152

• Run overnight

• System limits are higher than for online searches

System Limits

153

Locate patents discussing the use of organometallic substances containing the following structural fragment as polymerization catalysts.

R = Fe, Ti, Zr, Cr, Mg, Ni, Pd, As, Cu, Mo

[Structure diagram of the fragment containing R]

System Limits

154

• Enter structure-searchable file, upload query

• Run SAMPLE search, evaluate answers

• Run a FULL file structure search in BATCH mode

• Refine results

System Limits

155

=> FILE REGISTRY
=> Uploading metallo.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

• A G-group is used to define the metals

• Up to 20 options may be defined for a G-group

[Structure diagram of the fragment containing G1]

G1 Fe,Ti,Zr,Cr,Mg,Ni,Pd,As,Cu,Mo

System Limits

156

=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 16:23:37 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 21936 TO ITERATE

4.6% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.02

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 429897 TO 447543
PROJECTED ANSWERS: 127189 TO 136919

L2 50 SEA SSS SAM L1

System Limits

157

=> D SCAN
L2 50 ANSWERS REGISTRY COPYRIGHT 2001 ACS
IN 1-Butanaminium, N-[(1'-bromoferrocenyl)methyl]-N,N-dimethyl-, bromide (9CI)
MF C17 H25 Br Fe N . Br
CI CCS

[Structure diagram of this answer]

System Limits

158

• Batch search is run overnight

• Results are saved in an answer set

• ACTIVATE the saved answer set for display or additional searching

• No additional cost for a batch search

System Limits

159

=> BATCH
ENTER QUERY L# FOR BATCH REQUEST OR (END):L1
ENTER BATCH REQUEST NAME OR (END):?
Enter the name you wish to use for the BATCH request. The name must:

o o o
ENTER BATCH REQUEST NAME OR (END):METALLO/B
ENTER TYPE OF SEARCH (SSS), CSS, FAMILY, OR EXACT:SSS
ENTER SCOPE OF SEARCH (FULL) OR RANGE:FULL
QUERY L2 HAS BEEN SAVED AS BATCH REQUEST 'METALLO/B'

System Limits

160

=> D SAVED/A

NAME        CREATED      NOTES/TITLE
----------  ----------   --------------------------
METALLO/A   17 APR 2001  118512 ANSWERS IN FILE REG

=> FILE REGISTRY

=> ACTIVATE METALLO/A

L1 STR
L2 118512 SEA FILE=REGISTRY SSS FUL L1

=> D SCAN

System Limits

161

• Up to 300K REG answers may be crossed over to a CAS database

• The results of a structure search may be refined with additional substance criteria, e.g.
– ELS - Element Symbol

System Limits

162

=> S L2 AND (ZR OR TI)/ELS
109090 ZR/ELS
199331 TI/ELS
L3 28115 L2 AND (ZR OR TI)/ELS

This answer set may now be crossed into the CAplus file

=> FILE CAPLUS

=> S L3(L)(CAT/RL OR CATAL?) AND POLY? AND PATENT/DT

L4 3510 L3(L)(CAT/RL OR CATAL?) AND POLY? AND PATENT/DT

System Limits

163

=> D 1 L5 BIB ABS HITSTR
L5 ANSWER 1 OF 3510 CAPLUS
AN 2001:235582 CAPLUS
TI Metallocene polymerization catalysts for polyolefin preparation
IN Yamamoto, Kazuhiro; Maruyama, Yasuo; Kanno, Toshihiko
PA Nippon Polychemicals Co., Ltd., Japan
SO Jpn. Kokai Tokkyo Koho, 13 pp. CODEN: JKXXAF
DT Patent
LA Japanese
FAN.CNT 1
   PATENT NO.    KIND  DATE      APPLICATION NO.  DATE
   ------------  ----  --------  ---------------  --------
PI JP2001089512  A2    20010403  1999JP-0266058   19990920

HITSTR shows the “hit” CAS RN’s for each answer.

System Limits

164

IT INDEXING IN PROGRESS
IT 37206-41-0, Bis(cyclopentadienyl)zirconium dibenzyl
   RL: CAT (Catalyst use); USES (Uses)
   (catalyst support; metallocene polymn. catalysts for polyolefin prepn.)
RN 37206-41-0 CAPLUS
CN Zirconium, bis(.eta.5-2,4-cyclopentadien-1-. . . . .

[Structure diagram of the hit substance]

System Limits

165

• Strategies when search limits are reached
– STN system-related options

• Range searching

System Limits

166

• May be used to get the search to run to completion within ONLINE limits

• Use if the full-file projection is not too large AND projected answers are within limits

• Conduct an immediate search using two structure searches covering different segments of the file

System Limits

167

What has been reported on the use of substances containing the following structural fragment as part of a catalyst or initiator in polymerization processes?

R1 = O, S, C

Additional substitution may be present. No fusion is allowed on the N-containing ring.

[Structure diagram: ring containing two N atoms and R1]

System Limits

168

• Enter structure-searchable file, upload query
• Run SAMPLE search, evaluate answers
• Run a FULL file structure search until it reaches system limits and stops
• Identify the oldest CAS RN in the partial answer set
• Run a range search over the remaining part of the database
• Combine answer sets
• Refine results


169

=> FILE REGISTRY
=> Uploading C:\Program Files\Stnexp\Queries\n2ring.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

•Use a G-group to define the options for R1

•Isolate the ring to prevent fusion

N N

G1

G1 O,S,C


170

=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 21:38:35 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 22582 TO ITERATE

4.4% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 442690 TO 460590
PROJECTED ANSWERS: 43190 TO 48944

L2 50 SEA SSS SAM L1

=> D SCAN


171

=> S L1 SSS FULL
FULL SEARCH INITIATED 21:12:15 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 454927 TO ITERATE

87.9% PROCESSED 400000 ITERATIONS 36444 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.06

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 454927 TO 454927
PROJECTED ANSWERS: 40838 TO 42058

L3 36444 SEA SSS FUL L1

The newest compounds are in L3.
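The percentage in the status line appears to be the number of iterations processed divided by the number of candidates left to iterate after screening. The Python sketch below checks that reading against the figures shown in this deck. The function name is hypothetical and is not part of STN.

```python
def percent_processed(iterations_done: int, total_to_iterate: int) -> float:
    """Return iterations_done as a percentage of total_to_iterate, rounded to one decimal."""
    if total_to_iterate <= 0:
        raise ValueError("total_to_iterate must be positive")
    return round(100.0 * iterations_done / total_to_iterate, 1)


# Figures copied from the FULL and SAMPLE searches on L1 in this deck.
full_search = percent_processed(400000, 454927)
sample_search = percent_processed(1000, 22582)
print(full_search, sample_search)
```

Under this reading, the part of the file still unsearched is what the range search on the following slides has to cover.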


172

=> D 36444 RN
L3 ANSWER 36444 OF 36444 REGISTRY COPYRIGHT 2001 ACS
RN 54833-00-0 REGISTRY


173

Type this To search the following

RAN=,xxx-xx-x From the oldest CAS RN up through xxx-xx-x

RAN=xxx-xx-x, From xxx-xx-x through the newest CAS RN

RAN=yyy-yy-y,xxx-xx-x From yyy-yy-y through xxx-xx-x
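The three RAN= forms above can be assembled mechanically. The Python sketch below builds the argument string and, as a sanity check before typing it into STN, verifies the CAS RN check digit (the last digit equals the weighted sum of the preceding digits, counted from the right, modulo 10). Both function names are hypothetical helpers, not STN commands.

```python
import re


def cas_check_digit_ok(rn: str) -> bool:
    """Validate a CAS RN of the form NNNNNNN-NN-N using its check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight 1 for the rightmost digit before the check digit, 2 for the next, and so on.
    total = sum(int(d) * weight for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def ran_argument(start=None, end=None) -> str:
    """Build 'RAN=start,end'; an omitted bound means oldest (start) or newest (end)."""
    if start is None and end is None:
        raise ValueError("give at least one CAS RN")
    for rn in (start, end):
        if rn is not None and not cas_check_digit_ok(rn):
            raise ValueError(f"not a valid CAS RN: {rn}")
    return f"RAN={start or ''},{end or ''}"


print(ran_argument(end="54833-00-0"))    # oldest RN up through 54833-00-0
print(ran_argument(start="54833-00-0"))  # 54833-00-0 through the newest RN
```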


174

=> S L1 SSS RAN=,54833-00-0

RANGE MORE THAN 100,000. WILL BE BILLED AS A FULL FILE SEARCH.
INITIATE SEARCH? Y/(N):Y

RANGE SEARCH INITIATED 21:14:02 FILE 'REGISTRY'
RANGE SCREEN SEARCH COMPLETED - 54955 TO ITERATE

100.0% PROCESSED 54955 ITERATIONS 10234 ANSWERS
SEARCH TIME: 00.00.03

L4 10234 SEA RAN=(,54833-00-0) SSS L1


175

=> S L3 OR L4

L5 46677 L3 OR L4
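L5 is one smaller than the sum of L3 (36444) and L4 (10234) because OR counts a substance only once. A plausible reading, stated here as an assumption, is that the boundary RN 54833-00-0 is in both sets. The Python sketch below shows the same arithmetic on toy sets with placeholder identifiers.

```python
# Toy answer sets; the identifiers are placeholders, not real CAS RNs.
partial_full_search = {"RN-A", "RN-B", "RN-BOUNDARY"}
range_search = {"RN-BOUNDARY", "RN-C"}

combined = partial_full_search | range_search  # what OR does
overlap = partial_full_search & range_search   # substances found by both searches

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
assert len(combined) == len(partial_full_search) + len(range_search) - len(overlap)

# The same identity applied to the counts reported for L3, L4 and L5.
implied_overlap = 36444 + 10234 - 46677
print(len(combined), implied_overlap)
```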

=> FILE CAPLUS

=> S L5/CAT OR (L5 (L) ?POLYM? and (?CATAL? OR ?INITIAT?))

L6 310 L5/CAT OR (L5 (L) ?POLYM? AND (?CATAL? OR ?INITIAT?))

=> D SCAN


176

• Use RANGE= in SEARCH to specify portion of the database

• Use CAS RNs in REGISTRY to define the range

• If the range covers more than 100,000 substances, a full-file search fee is charged; if it covers 100,000 or fewer, a lower fee is charged
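The fee rule can be written as a single comparison. In this Python sketch only the 100,000 threshold comes from the slide; the function name and the two labels are illustrative, and actual prices are not modelled.

```python
FULL_FILE_THRESHOLD = 100_000  # substances in the range, as stated on the slide


def range_fee_class(substances_in_range: int) -> str:
    """Classify a range search as billed at the full-file fee or at the lower fee."""
    return "full-file" if substances_in_range > FULL_FILE_THRESHOLD else "reduced"


print(range_fee_class(100_000))  # at the threshold: lower fee
print(range_fee_class(100_001))  # just over the threshold: full-file fee
```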

To check RN versus time, use:
=> help rnyear
=> help rnweek


177

• Strategies when search limits are reached:
– STN system-related options
• Subset searching


178

Structure searches may be run against the entire REGISTRY file, or:
• Against a subset of the database
• The subset may be defined using:
– Substance terms
– Subject terms


179

[Figure: iterations plotted against the number of compounds in REGISTRY (20, 40, and 80 million), with 1000000 marked on the iterations axis. Structure 1 (a more general query) and Structure 2 (a more specific query) are drawn as lines with different slopes. The label "Subset" also appears on the figure.]

180

Term | Field | Example
Presence/absence of an atom | /ELS | P/ELS (P is present)
Atom counts | n/elementsymbol; m-n/elementsymbol; elementsymbol>n | 3/Cl; 3-7/F; F>10
Ring system elements | /REL | N/REL


181

Term | Field | Example
Ring element counts | n elementsymbol/REL; n-m elementsymbol/REL; >n elementsymbol/REL | 2/REL; 2-3 N/REL; >2 N/REL
Ring atom count | /RATC | 6/RATC


182

Term | Field | Example
Ring elemental formula | /RELF | C Fe/RELF
Ring identifier | /RID | 46.383/RID


183

Skills Practice

Search for saturated, unsubstituted alkyl alcohols containing 10-20 carbon atoms, with a chiral center and a reported BP at 760 Torr.

184

Uploading C:\Program Files\stnexp\Queries\Alcohol.str

chain nodes :1 2
chain bonds :1-2
exact/norm bonds :1-2
Connectivity :1:1 E exact RC ring/chain 2:1 E exact RC ring/chain
Match level :1:CLASS 2:CLASS
Generic attributes :1:
Saturation : Saturated
Number of Carbon Atoms : 7 or more
L1 STRUCTURE UPLOADED

185

=> l1
SAMPLE SEARCH INITIATED 05:28:26
SAMPLE SCREEN SEARCH COMPLETED - 931004 TO ITERATE
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 0
L2 0 SEA SSS SAM L1

=> c h o/elf(p)10-20/c(p)1/o and nc=1 and no rsd/fa
L3 22012 C H O/ELF(P)10-20/C(P)1/O AND . . . .

=> l1 full subset=l3
L4 2003 SEA SUB=L3 SSS FUL L1

186

=> l4 and 760 torr/bp.p
12243468 760 TORR/BP.P
L5 1694 L4 AND 760 TORR/BP.P

=> l5 and stereosearch/fs
5211009 STEREOSEARCH/FS
L6 529 L5 AND STEREOSEARCH/FS

=> d qrd
L6 ANSWER 1 OF 529 REGISTRY COPYRIGHT 2003 ACS on STN . . . . . . . . . .

(CH2)8 EtMe

OH

R

Calculated Properties
Boiling Point (BP)
CODE| VALUE | CONDITION | NOTE
====+==============+=================+=======
BP |518.65+/-8.0 K|Press: 760.0 Torr|(1) ACD

187

Locate references discussing the photographic applications of compounds containing the following structural fragment.

The heterocyclic ring contains at least 2 N atoms, and at least 6 atoms in total. Bonds in the fragment are all chain bonds.

NHy

O


188

• Enter structure-searchable file, upload query
• Run SAMPLE search, evaluate answers
• Define a subset
• Create the subset
• Run a SAMPLE structure search and evaluate results
• Run a FULL file structure search
• Refine results


189

=> FILE REGISTRY
=> Uploading subset.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

• Hy with an element count of nitrogen, minimum 2, is used

N

Hy

O


190

=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 21:49:03 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 123620 TO ITERATE

0.8% PROCESSED 1000 ITERATIONS 9 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 20251

L2 9 SEA SSS SAM L1


191

=> ....Testing the current file.... screen

ENTER SCREEN EXPRESSION OR (END):end

=> SCREEN 1994 AND 2004 AND 1838

L3 SCREEN CREATED

=> SCREEN 2043

L4 SCREEN CREATED

•3 or more N

•1 or more O

•1 or more rings

•Polymers (2043) are excluded


192

=> Uploading C:Files.str
L5 STRUCTURE UPLOADED

=> QUE L5 AND L3 NOT L4
L6 QUE L5 AND L3 NOT L4

=> S L6 SSS SAM
SAMPLE SEARCH INITIATED 21:50:01 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 73719 TO ITERATE
1.4% PROCESSED 1000 ITERATIONS 23 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000

L7 23 SEA SSS SAM L5 AND L3 NOT L4


193

=> S >=2 N/REL (S) RATC>=6

15045119 REL.CNT >= 2
7536633 N/REL
3983611 >=2 N/REL (REL.CNT >= 2 (T) N/REL)
14208100 RATC>=6
L8 3014063 >=2 N/REL (S) RATC>=6

•2 or more Nitrogen ring elements

•6 or more ring atoms


194

=> S L6 SUB=L8 SSS SAM

PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
PROJECTED ITERATIONS (WITHIN SPECIFIED SUBSET): 333170 TO 348750
PROJECTED ANSWERS (WITHIN SPECIFIED SUBSET): 8872 TO 11584

L9 30 SEA SUB=L8 SSS SAM L5 AND L3 NOT L4


195

=> S L6 SUB=L8 SSS FULL

L10 18836 SEA SUB=L8 SSS FUL L5 AND L3 NOT L4

[Structure diagrams of example answers from L10]


196

=> FILE CAPLUS

=> S L10 AND PHOTOG?
L11 118 L10 AND PHOTOG?

=> D SCAN
L11 118 ANSWERS CAPLUS COPYRIGHT 2001 ACS
IC ICM C09B-023/00
ICS C07D-261/12; C07D-277/30; C07D-403/14; C07D-413/14; C07D-417/14; G03C-001/14
CC 41-6 (Dyes, Organic Pigments, Fluorescent Brighteners, and Photographic Sensitizers)
Section cross-reference(s): 74
TI Methine compounds for spectral sensitizers and silver halide photographic materials using the same
o o o


197

Locate substances containing indole that have been patented for use as fabric dyes.


198

=> FILE REGISTRY
=> Uploading C:\Program Files\Stnexp\Queries\indole.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

• It is unlikely that this will run within system limits

N


199

=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 13:50:22 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 74242 TO ITERATE

1.3% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.02

FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 407163

L2 50 SEA SSS SAM L1


200

=> FILE CAPLUS

=> S DYE# (L) FABRIC#

239878 DYE#
96285 FABRIC#
L3 17681 DYE# (L) FABRIC#

=> S L3 AND PATENT/DT

3189643 PATENT/DT
L4 11925 L3 AND PATENT/DT


201

=> FILE REGISTRY

=> TRANSFER

ENTER L# (L4) OR ?:L4
ENTER ANSWER NUMBERS, RANGES (1-), OR ?:1-
ENTER DISPLAY FIELDS (TI) OR ?:RN

L5 TRANSFER L4 1- RN : 30879 TERMS

L6 30670 L5


202

=> S L1 SUB=L6 SSS SAM
SAMPLE SUBSET SEARCH INITIATED 14:00:21 FILE 'REGISTRY'
SAMPLE SUBSET SCREEN SEARCH COMPLETED - 54 TO ITERATE

100.0% PROCESSED 54 ITERATIONS 8 ANSWERS
SEARCH TIME: 00.00.01

PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
PROJECTED ITERATIONS (WITHIN SPECIFIED SUBSET): 640 TO 1520
PROJECTED ANSWERS (WITHIN SPECIFIED SUBSET): 8 TO 329

L7 8 SEA SUB=L6 SSS SAM L1

=> D SCAN O O O


203

=> S L1 SUB=L6 SSS FULL

FULL SUBSET SEARCH INITIATED 14:00:40 FILE 'REGISTRY'
FULL SUBSET SCREEN SEARCH COMPLETED - 998 TO ITERATE

100.0% PROCESSED 998 ITERATIONS 110 ANSWERS
SEARCH TIME: 00.00.01

L8 110 SEA SUB=L6 SSS FUL L1


204

=> FILE CAPLUS

=> S L8 (L) DYE# AND L4

32219 L8
239878 DYE#
301 L8 (L) DYE#
L9 34 L8 (L) DYE# AND L4

=> D BIB HITSTR

34 patents describe indole-containing substances used as fabric dyes


205

o o o
IT 117584-16-4
RL: PRP (Properties); TEM (Technical or engineered material use); USES (Uses)
(dye; ink-jet printing inks containing disperse dyes for printing fabrics with high colorfastness and high black color yield)
RN 117584-16-4 CAPLUS
CN 1H-Indole, 3-[(2,6-dichloro-4-nitrophenyl)azo]-1-methyl-2-phenyl- (9CI) (CA INDEX NAME)

Me

Ph

NO2Cl

Cl

N N

N


206

Me

CO2H

Me

Me

This group is repeated 1 to 2 times. No other substitution is allowed.

Skills Practice

207

=> fil reg
L1 STRUCTURE UPLOADED
=> d
L1 HAS NO ANSWERS
L1 STR

   Structure attributes must be viewed using STN Express query preparation. 

Me

CO2H

Me

1-2

208

=> l1
SAMPLE SEARCH INITIATED 06:58:14
SAMPLE SCREEN SEARCH COMPLETED - 34814 TO ITERATE
2.9% PROCESSED 1000 ITERATIONS 0 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: 685172 TO 707388
PROJECTED ANSWERS: 0 TO 0
L2 0 SEA SSS SAM L1

209

“Filters” to Reduce Iterations

Filters may be added during the save process

To add filters:

• Check “Refine Using Structure Filters”
• Click SAVE

210

STN Express often suggests filters

• May add others using AND or NOT logic


211

212

213

=> nc=1 and no rsd/fa and 5-8/c and c h o/elf
40901379 NC=1
27196220 NO RSD/FA
1761030 5-8/C
2839176 C H O/ELF
L3 42352 NC=1 AND NO RSD/FA AND 5-8/C AND C H O/ELF

=> l1 full subset=l3
FULL SUBSET SEARCH INITIATED 07:01:14
FULL SUBSET SCREEN SEARCH COMPLETED - 4319 TO ITERATE
100.0% PROCESSED 4319 ITERATIONS 70 ANSWERS
SEARCH TIME: 00.00.01
L4 70 SEA SUB=L3 SSS FUL L1

Subset Searching using Dictionary

214

Me

CMe2HO2C CHCCH

Me

CMe2HO2C E

Bu-i

Me

HO2C CHCH2

CH2 CHC CH3

CH3O

HO

215

Troubleshooting Tips

216

Structure Search Matches

• An answer to a structure search must contain all the following from the query:
– Atoms
– Bonds
– Connections

• Relevant answers that may not be retrieved include:
– Incompletely defined compounds
– Compounds with unanticipated bonding patterns

217

• Structure searches return precise results

• Try MF searches when structure searches turn up no hits to look for:
– Incompletely defined compounds
– Different bonding patterns


218

Incompletely Defined Compounds

RN 26249-12-7 REGISTRY
CN Benzene, dibromo- (8CI, 9CI) (CA INDEX NAME)
OTHER NAMES:
CN Dibromobenzene
MF C6 H4 Br2
CI IDS, COM

BrD1( )2

219

• Unknown point of attachment for one or more known substituents

• Unknown site of saturation/unsaturation for one or more bonds

• Unknown branching in specific carbon-chain substituents

• Unknown site for esterification/etherification in polyacids or polyols


220

Queries:

IDS not retrieved:

BrD1( )2


221

Find the CAS RN for this compound.


222

=> FILE REGISTRY
=> Uploading ids.str
L1 STRUCTURE UPLOADED

=> D L1
L1 HAS NO ANSWERS
L1 STR

ClNH2 O N

ClNNHN ON N

ClN CH2CHMePh O S CH2CH2OSO3H

O

SO3H

SO3H


223

=> S L1 FAM SAM
L2 0...

=> S L1 FAM FULL
L3 0...


224

To locate a CAS RN for an IDS

Conduct an MF search and refine by name fragments if needed
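An MF search depends on writing the formula in the expected element order. The formulas shown in this deck (for example C32H25Cl3N8O14S4 and C4H5IN2) follow the Hill convention: carbon first, hydrogen second, then the remaining elements alphabetically. The Python sketch below produces that order from element counts. The helper name is hypothetical, and the sketch assumes the compound contains carbon or that plain alphabetical order is wanted otherwise.

```python
def hill_formula(counts: dict) -> str:
    """Format element counts as a Hill-order molecular formula string."""
    symbols = sorted(counts)
    if "C" in counts:
        # Carbon first, then hydrogen (if present), then the rest alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # Without carbon, every element (including H) is listed alphabetically.
        order = symbols
    return "".join(s + (str(counts[s]) if counts[s] != 1 else "") for s in order)


ids_example = hill_formula({"S": 4, "O": 14, "N": 8, "Cl": 3, "H": 25, "C": 32})
print(ids_example)  # string to use with => E <formula> in REGISTRY
```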


225

=> FILE REGISTRY

=> E C32H25Cl3N8O14S4

E1 1 C32H25CL3N8O/BI
E2 1 C32H25CL3N8O13S4/BI
E3 3 --> C32H25CL3N8O14S4/BI

ooo

=> S E3
L1 3...


226

OSO3H

Me

Ph

Cl

Cl

ClNO

ON

H2N

NHN

N

N

N

CH2 CH

S CH2

O

O CH2

SO3HD12

=> D SCAN

=> S L1 AND 2(W)2
L2 1

=> D L2 1 RN STR REF
RN 163499-22-7 REGISTRY

227

• Alternating double and single bonds
– Tautomers
– Aromatic rings

• Many changed to “normalized” in REGISTRY

• STN Express changes query structures to allow for exact value or normalized

Bonding Patterns

228

• Exceptions to “Normalized” assignment in REGISTRY

Enol-keto tautomers

Pyrazole and pyrazolium rings

1,2,4-Dithiazolium rings

1,2- and 1,3-dithiolium rings

Tropolone derivatives

Porphines

Phorbines

Phthalocyanines

Cyanine dyes


229

CA Index Name: 1H-Pyrazole, 3-chloro-5-methyl-

Author structure:

REGISTRY structure:


230

Locate references discussing the preparation of this pyrazole derivative


231

=> Uploading pyrazole.str
L1 STRUCTURE UPLOADED

=> D L1

=> S L1 FAM SAM
L2 0…

=> S L1 FAM FULL
L3 0...


232

Technique 1: Conduct an MF search and refine with name fragments

Technique 2: Build a structure replacing single/double bonds with unspecified bonds and run an EXA or FAM search


233

=> FILE REGISTRY

=> E C4H5IN2

E1 2 C4H5IMG/BI
E2 1 C4H5IMGNO4/BI
E3 15 --> C4H5IN2/BI
E4 8 C4H5IN2O/BI
E5 4 C4H5IN2O2/BI

ooo

=> S E3
L1 15…


234

=> S L1 AND PYRAZ?
L2 7…

=> D SCAN

IN 1H-Pyrazole, 3-iodo-5-methyl- (9CI)
MF C4 H5 I N2

Me

I

NH

N

Compare bonding to query structure


235

=> S 1H-PYRAZOLE, 3-IODO-5-METHYL-/CN
L3 1 ...

=> FILE CAPLUS

=> S L3/PREP
L4 1…


236

=> Uploading pyrazole.str
L1 STRUCTURE UPLOADED

=> D L1

=> S L1 FAM FULL
L3 1...

N MeN

I


237

Many Structures Searching

238

AND OR NOT


239

Structure queries with unattached fragments:

Same Structure Query (L1)

Different Structure Queries (L1 AND L2)


240

Same Structure Query (L1)

Fragments in the same structure component

No overlap of fragment atoms


241

Different Structure Queries (L1 AND L2)

Fragments in the same structure component with no overlap

Fragments in the same structure component with overlap

Fragments in different components


242

X

X

X

X

X

Searching these two fragments in the same L1

Skills Practice

243

NH

CF3

CF3

OH

Cl

ClO

Is this structure retrieved? Yes

244

N

N

Cl Cl

CF3

Is this structure retrieved? No

245

N

Ph

Me

CF3

Cl

Cl

COMP. 1

COMP. 2

Is this structure retrieved? No. Is it retrieved by the search L1 AND L2? No.

246

A substance produced by bacteria has been isolated in the synthesis lab. The analytical department has provided the following information:

FW = 1180-2000

The following fragments are inside the compound, in ring or chain:

Me

O

H

H

H

H

H

Me

O

O

H

N

HH

H H

H

H

Skills Practice

247

  chain nodes :1 4 5 6 7 ring/chain nodes :2 3 chain bonds :1-2 2-5 3-4 3-6 4-7 ring/chain bonds :2-3 exact/norm bonds :2-3 3-4 exact bonds :1-2 2-5 3-6 4-7

248

 chain nodes :3 4 ring/chain nodes :1 2 chain bonds :1-3 2-4 ring/chain bonds :1-2 exact/norm bonds :1-2 exact bonds :1-3 2-4

249

 chain nodes :1 6 7 ring/chain nodes :2 3 4 5 chain bonds :1-2 3-6 5-7 ring/chain bonds :2-3 3-4 4-5 exact/norm bonds :2-3 3-4 3-6 4-5 exact bonds :1-2 5-7

250

 chain nodes :3 4 5 6 7 8 9 ring/chain nodes :1 2 chain bonds :1-4 1-5 2-3 2-6 2-7 3-8 3-9 ring/chain bonds :1-2 exact/norm bonds :1-2 2-3 exact bonds :1-4 1-5 2-6 2-7 3-8 3-9

251

=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\fragments.str
L1 STRUCTURE UPLOADED

=> d
L1 HAS NO ANSWERS
L1 STR
* STRUCTURE DIAGRAM TOO LARGE FOR DISPLAY - AVAILABLE VIA OFFLINE PRINT *
Structure attributes must be viewed using STN Express query preparation.

252

=> 1180-2000/fw
L2 457267 1180-2000/FW

=> l1 full subset=l2
FULL SUBSET SEARCH INITIATED 06:14:40
FULL SUBSET SCREEN SEARCH COMPLETED - 100272 TO ITERATE
100.0% PROCESSED 100272 ITERATIONS 7 ANSWERS
SEARCH TIME: 00.00.02
L3 7 SEA SUB=L2 SSS FUL L1

253

=> d 1-7

L3 ANSWER 2 OF 7 REGISTRY COPYRIGHT 2002 ACS
RN 184490-65-1 REGISTRY
CN .....
OTHER NAMES:
CN Desertomycin I
FS STEREOSEARCH
MF C61 H109 N O21
SR CA
LC STN Files: CA, CAPLUS

254

Me

Me

HO

OH OH

HO

HO

PAGE 1-A

Me

Me Me Me

OH OH OH OH

HO

OO

PAGE 1-B

(CH2)3

Me

OHNH2

PAGE 1-C

Me

Me

O

O

OH

OH

OH

OH

OH

OH

S S

R

SS

PAGE 2-A

255

Substitution allowed

Substitution not allowed

Skills Practice

Me N

N

H

256

=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\substitution.str
chain nodes :1 2 3 4 5 6 7
chain bonds :1-2 2-3 2-4 3-6 3-7 4-5
exact/norm bonds :2-3 2-4 3-7
exact bonds :1-2 3-6 4-5
L1 STRUCTURE UPLOADED

257

=> c h n/elf and no rsd/fa and nc=1 and c<=10
L2 15770 C H N/ELF AND NO RSD/FA AND NC=1 AND C<=10

=> l1 full subset=l2
L3 61 SEA SUB=L2 SSS FUL L1

=> d scan
L3 61 ANSWERS REGISTRY COPYRIGHT 2003 ACS
IN Ethanimidic acid, N-butyl-, 2,2-dimethylhydrazide (9CI)
MF C8 H19 N3

n-Bu Me

NMe2N C

NH

258

=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\substitution.str
chain nodes :1 2 3 4 5 6 7
chain bonds :1-2 2-3 2-4 3-6 3-7 4-5
exact/norm bonds :2-3 2-4 3-7
exact bonds :1-2 3-6 4-5
Connectivity :4:1 E exact RC ring/chain
L1 STRUCTURE UPLOADED

259

=> c h n/elf and no rsd/fa and nc=1 and c<=10
L2 15770 C H N/ELF AND NO RSD/FA AND NC=1 AND C<=10

=> l1 full subset=l2
L3 25 SEA SUB=L2 SSS FUL L1

=> d scan
L3 25 ANSWERS REGISTRY COPYRIGHT 2003 ACS
IN Ethanimidamide, N-cyano- (9CI)
MF C3 H5 N3
CI COM

MeNC NH C

NH

260

SMARTracker

261

SMARTracker

=> FILE REGISTRY

=> STR 33069-62-4
:END
L1 STRUCTURE CREATED

=> S L1 FUL
FULL SEARCH INITIATED 15:42:29
FULL SCREEN SEARCH COMPLETED - 2420 TO ITERATE
100.0% PROCESSED 2420 ITERATIONS 877 ANSWERS
SEARCH TIME: 00.00.06
L2 877 SEA SSS FUL L1

262

SMARTracker

=> S L2/THU AND P/DT
2047 L2
146546 THU/RL
670 L2/THU (L2 (L) THU/RL)
2188789 P/DT
L3 115 L2/THU AND P/DT

263

SMARTracker

=> SMART
SMARTracker INITIATED
ENTER QUERY L# FOR SDI REQUEST OR (END):L3
ENTER UPDATE FIELD CODE (UP) OR ?:.
ENTER SDI REQUEST NAME, (AA013/S), OR END:TAXOLS/S
ENTER COST CENTER (NONE) OR NONE:.
ENTER TYPE OF SEARCH (SSS), CSS, FAMILY, OR EXACT:.
ENTER TITLE (NONE):.
ENTER METHOD OF DELIVERY (OFFLINE), ONLINE, EMAIL, OR FAX:EMAIL
ENTER EMAIL ID (1190C):[email protected]
RECEIVE DELIVERY NOTIFICATION? (Y)/N:N
ELIMINATE PREVIOUSLY SEEN ANSWERS WITH EACH SDI RUN? Y/(N):Y
ENTER PRINT FORMAT (BIB) OR ?:CBIB ABS HITSTR
HIGHLIGHT HIT TERMS? (Y)/N:.
ENTER MAXIMUM NUMBER OF HITS TO BE PRINTED PER RUN (100):.
SORT SDI ANSWER SET (N)/Y?:N
SEND SDI WITH NO ANSWERS? (Y)/N:N
ENTER SDI RUN FREQUENCY: (WEEKLY), BIWEEKLY, OR ?:.
ENTER SDI EXPIRATION DATE 'YYYYMMDD' OR (NONE):.
QUERY 'L3' HAS BEEN SAVED AS SDI REQUEST 'TAXOLS/S'

Or => SDI XFILE