Not just another reaction database

Not just another Reaction Database Aileen Day 1 , Valery Tkachenko 1 , Alexey Pshenichnov 1 , Leah McEwen 2 , Simon Coles 3 , Richard Whitby 3 1 Data Science, Royal Society of Chemistry 2 Physical Sciences Library, Cornell University 3 Department of Chemistry, University of Southampton


Page 1: Not just another reaction database

Not just another Reaction Database

Aileen Day1, Valery Tkachenko1, Alexey Pshenichnov1, Leah McEwen2, Simon Coles3, Richard Whitby3

1 Data Science, Royal Society of Chemistry
2 Physical Sciences Library, Cornell University
3 Department of Chemistry, University of Southampton

Page 2: Not just another reaction database
Page 3: Not just another reaction database

RSC Archive – 480,000+ articles

Page 4: Not just another reaction database

Digitally Enabling RSC Archive

Page 5: Not just another reaction database

Article X-ray

Compounds

Reaction

Analytical Data

Text and References

Page 6: Not just another reaction database

The RSC data repository is under development, and is intended to contain chemical data which supports its publications.

A first version has been written which captures compounds, data sources and properties domains.

Reactions are next…

RSC data repository: Compounds, Properties, Data sources, Reactions

Page 7: Not just another reaction database

• There are a lot of reaction databases already – many are well established and contain many reactions

• This reactions database aims to capture reactions:

• in sufficient detail for someone else to reproduce

• in analogous ways to those captured in an Electronic Lab Notebook

• with fully recorded processes, parameters and equipment in S88 process recipe [1] style

• raw characterization data linked to products

• which gave low yields or unintended products

• multistep reactions

• to fully record all reaction products (not just the target product)

• Guided by the aims of Dial-a-Molecule

RSC data repository - reactions

[Figure labels: Detail; Scope]

Page 8: Not just another reaction database

Dial-a-Molecule aim

Page 9: Not just another reaction database

Dial-a-Molecule Roadmap

• To provide this

• In such a way that others can do this kind of analysis

• For these to be a potential source

Page 10: Not just another reaction database

RSC data repository – reactions domain

Reactions Substances Procedures Equipment

Compounds

Mixtures

Solutions

Samples

Reaction runs Steps

Parameters

Page 11: Not just another reaction database

Reaction examples

• Reaction 1: Example of reaction text-mined from RSC archive by NextMove with S88-style procedure

• Reaction 2: Example from Will Dichtel’s research group via Leah McEwen (ELN-style reaction)

Page 12: Not just another reaction database

Reaction 1: NextMove reaction text-mined from RSC archive – original article

Page 13: Not just another reaction database

Reaction 1: NextMove reaction text-mined from RSC archive – cml output

<?xml version="1.0" encoding="UTF-8"?><reactionList xmlns="http://www.xml-cml.org/schema" xmlns:cmlDict="http://www.xml-cml.org/dictionary/cml/" xmlns:nameDict="http://www.xml-cml.org/dictionary/cml/name/" xmlns:unit="http://www.xml-cml.org/unit/" xmlns:cml="http://www.xml-cml.org/schema" xmlns:dl="http://bitbucket.org/dan2097"><reaction><dl:source><dl:documentId>c3ra45871g</dl:documentId><dl:paragraphText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid. [α]D20 −24.2 (c 1.1, CHCl3); 1H NMR (CDCl3, 300 MHz) δ 0.04 (s, 3H), 0.07 (s, 3H), 0.85 (s, 9H), 1.34 (s, 3H), 1.44 (s, 3H), 2.16 (br, 1H), 3.68–3.81 (m, 3H), 4.16 (t, J = 13.8 Hz, J = 13.8 Hz, 1H), 4.59 (t, J = 6.6 Hz, J = 6.6 Hz, 1H), 5.22 (d, J = 10.7 Hz, 1H), 5.34 (d, J = 17.1 Hz, 1H), 5.90 (ddd, J = 7.2 Hz, J = 10.2 Hz, J = 17.2 Hz, 1H); 13C NMR (CDCl3, 75 MHz) δ 134.1, 118.4, 108.5, 79.5, 78.8, 70.8, 65.0, 27.8, 25.9, 25.4, 18.1, −3.7, −4.4. 
HRMS (ESI) calcd for [M + Na]+ (C15H30O4SiNa) 325.1811, found 325.1807.</dl:paragraphText></dl:source><dl:reactionSmiles>[H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C&gt;ClCCl&gt;[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1|</dl:reactionSmiles><productList><product role="product"><molecule id="m0"><name dictRef="nameDict:unknown">10</name><dl:nameResolved>(R)-2-((tert-Butyldimethylsilyl)oxy)-2-((4S,5S)-2,2-dimethyl-5-vinyl-1,3-dioxolan-4-yl)ethanol</dl:nameResolved></molecule><amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount><amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount><amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount><amount dl:propertyType="CALCULATEDPERCENTYIELD" dl:normalizedValue="79.1" units="unit:percentYield">79.1</amount><identifier dictRef="cml:smiles" value="C(C)(C)(C)[Si](O[C@H](CO)[C@H]1OC(O[C@H]1C=C)(C)C)(C)C"/><identifier dictRef="cml:inchi" value="InChI=1S/C15H30O4Si/c1-9-11-13(18-15(5,6)17-11)12(10-16)19-20(7,8)14(2,3)4/h9,11-13,16H,1,10H2,2-8H3/t11-,12+,13-/m0/s1"/><dl:entityType>definiteReference</dl:entityType><dl:appearance>colourless</dl:appearance><dl:state>liquid</dl:state>

Page 14: Not just another reaction database

Reactions properties

Reaction is defined by:
• Reaction SMILES from text-mining output

NextMove’s NameRxn program categorises the reaction by:
• Named Reaction ontology ID and name [1]
• Reaction Class and name [2]

[1] https://github.com/rsc-ontologies/rxno
[2] Carey, Laffan, Thomson and Williams hierarchy: DOI: 10.1039/B602413K

Page 15: Not just another reaction database

Reaction 1: Reaction

• Reaction SMILES: [H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C>ClCCl>[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1|

• ReactionClass: “9.7 Other functional group interconversion”
• Other Named Reaction: “9.7.61 Ester hydrolysis”

From NextMove’s NameRxn reaction output (software source should be linked from Properties database)

As well as reaction SMILES we can store Reaction RXN, RD and ChemDraw files.
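As a rough illustration of how a stored reaction SMILES relates to reaction components, the sketch below splits a `reactants>agents>products` string into its parts. It is not the repository's code: the function name `split_reaction_smiles` and the shortened toy reaction are assumptions made for the example, and the trailing `|f:0.1|` block is simply stripped (it appears to be a fragment-grouping extension that keeps the two halves of the hydride reagent together).

```python
def split_reaction_smiles(rxn: str) -> dict:
    """Split 'reactants>agents>products' into lists of component SMILES.

    Hypothetical helper for illustration only; it does no chemical validation.
    """
    # Drop an optional trailing extension block such as "|f:0.1|".
    core = rxn.split("|", 1)[0].strip()
    parts = core.split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators")
    # Individual molecules (or ionic fragments) are separated by '.'.
    reactants, agents, products = (
        [s for s in part.split(".") if s] for part in parts
    )
    return {"reactants": reactants, "agents": agents, "products": products}


# Toy reaction (not from the slides), written in the same style as Reaction 1.
toy = "[H-].C([Al+]CC(C)C)C(C)C.CC(=O)OC>ClCCl>CO |f:0.1|"
result = split_reaction_smiles(toy)
print(result["agents"])
```

A real import would also need to regroup the fragments named in the extension block before assigning reaction roles.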

Page 16: Not just another reaction database

Reaction 1: Reaction reference

Reference
• URL: http://dx.doi.org/10.1039/c3ra45871g
• Title: "Diastereoselective vinylalumination for the synthesis of pericosine A, B and C"
• Description: Reaction text-mined by NextMove from RSC article with DOI: 10.1039/c3ra45871g
• Authors: Long-Shiang Li; Duen-Ren Hou
• Publication Date: 31/10/2013
• DOI: 10.1039/c3ra45871g
• Journal: RSC Advances
• Publication Type: Journal Article

Reference Details
• External Identifier: c3ra45871g: product 10
• Paragraph Text: Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise …

Page 17: Not just another reaction database

RSC data repository – reaction components

Reaction components define each reaction. Each component is:
• Defined as a substance/compound/solution/mixture
• Assigned a reaction role, which can take the values Reactant / Product / Solvent / Catalyst / Intermediate / ChiralAuxiliary

Text-mining identifies all compounds and solutions (indicated by molarity) that play a role in each reaction and returns the SMILES, InChI, reaction role, and amounts of each.

Page 18: Not just another reaction database

Reaction 1: compounds and solutions

Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid (aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid.

Solutions

• Diisobutylaluminium hydride

Compounds

• 9
• dichloromethane
• methanol
• citric acid
• water
• Dichloromethane
• Na2SO4
• 10

Ignored for now (only the name was extracted in this pass) – in time “Substances”

• SiO2
• EtOAc–hexanes

[Colour key: Reaction components: reactant, solvent, product; Other compound/substance used in procedure]

Page 19: Not just another reaction database

Reaction 1: Reaction Components

Ordinal ReactionRole Compound Solution

1 Reactant 1.1 M solute:Solvent:

2 Reactant

3 Solvent

4 Product

Page 20: Not just another reaction database

Reaction 1: Reaction rendering

Reaction

Solution: Diisobutylaluminium hydride

• Components (shown when you click on it):

• Solution Role: Solute; Molarity: 1.1 M; Compound: Diisobutylaluminium(1+) hydride

• Solution Role: Solvent; Compound: cyclohexane

Page 21: Not just another reaction database

RSC data repository – reaction runs

While the reaction information defines the overall reaction, the details about each specific instance of performing the reaction are stored in reaction runs:
• stoichiometry table of each component
• labels of components
• amounts of components
• links to specific samples and sources
• results and yields of products.

Page 22: Not just another reaction database

Reaction 1: Reaction Run

Reaction Run
• Label: Preparation of lithium acetylide (phenylethynyl)lithium; Experiment Stage: Executed
• Stoichiometry Table Rows

Label | Reaction Component | Volume (mL) | Mass (mg) | Moles (mMol) | Percentage Yield (%) | Substance State
Diisobutylaluminium hydride | Reactant | 2.93 | - | 3.23 | - | Liquid
9 | Reactant | - | 500 | 1.29 | - | Solid
dichloromethane | Solvent | 20 | - | - | - | Liquid
10 | Product | - | 308 | 1.02 | 79 | Solid
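The reported yield can be cross-checked against the molar amounts in the stoichiometry table, which is presumably how the CALCULATEDPERCENTYIELD value of 79.1 in the text-mined CML was obtained. The short sketch below assumes a 1:1 stoichiometry between limiting reactant 9 and product 10; the function name `percent_yield` is illustrative.

```python
def percent_yield(product_mmol: float, limiting_mmol: float,
                  equivalents: float = 1.0) -> float:
    """Percentage yield from moles of product and limiting reactant.

    `equivalents` is the moles of product expected per mole of limiting
    reactant (assumed to be 1 for Reaction 1).
    """
    return 100.0 * product_mmol / (limiting_mmol * equivalents)


# Reaction 1: 1.02 mmol of product 10 from 1.29 mmol of reactant 9.
calculated = percent_yield(1.02, 1.29)
print(round(calculated, 1))  # 79.1, against 79% reported in the article
```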

Page 23: Not just another reaction database

RSC data repository – procedure

For reactions to be fully reproducible and queryable they are captured in a way analogous to S88 process recipes [1]:
1. Break process down into a series of steps (actions)
2. Define parameters at any level (for whole experiment or for particular action)
3. Define equipment at any level (for whole experiment or for particular action)

[1] https://en.wikipedia.org/wiki/ISA-88
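One way to picture the three points above is a small tree of steps, where parameters and equipment can be attached at any level. The sketch below is only an illustration: the class name `Step`, its fields, and the rule that a step falls back to its parent's parameter when it has none of its own are all assumptions, not the repository's schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """Hypothetical S88-style node: a whole experiment or a single action."""
    title: str
    parameters: Dict[str, str] = field(default_factory=dict)
    equipment: List[str] = field(default_factory=list)
    children: List["Step"] = field(default_factory=list)
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Optional["Step"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Step") -> "Step":
        """Attach a child step and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def effective_parameter(self, name: str) -> Optional[str]:
        """Look the parameter up on this step, then on its ancestors."""
        node: Optional[Step] = self
        while node is not None:
            if name in node.parameters:
                return node.parameters[name]
            node = node.parent
        return None


experiment = Step("Experiment", parameters={"temperature": "-78 C"},
                  equipment=["round bottom flask"])
add_step = experiment.add(Step("Add"))
heat_step = experiment.add(Step("Heat", parameters={"temperature": "rt"}))
print(add_step.effective_parameter("temperature"))
print(heat_step.effective_parameter("temperature"))
```

Storing a parameter once at the experiment level and overriding it only where an action differs keeps the per-step records small, at the cost of a lookup when a step is displayed.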

Page 24: Not just another reaction database

S88-style procedures

Types of action which can be assigned to procedure steps (Action Types):

Add, Synthesize, Wait, Degas

Yield, Wash, Unknown, Irradiate

Stir, Extract, Precipitate, Mill

Remove, Filter, Partition, Sample

Heat, Concentrate, Quench, Reflux

Dry, Cool, Apparatus Action, Transfer

Purify, Dissolve, Recover

Page 25: Not just another reaction database

S88-style procedures

Parameters that can be assigned to actions or experiments: rate, speed, pH, time, pressure, particle size, volume, weight, quantity, temperature, sample ID

[Figure labels: Substance Parameters; Other Parameters; Can be time dependent]

Page 26: Not just another reaction database

Reaction 1: procedure steps

Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid (aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid.

Text mining breaks down the procedure summary into steps (the dl:phraseText of each dl:reactionAction in dl:reactionActionList):
• action="Add": Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C
• action="Stir": The reaction mixture was stirred at −78 °C for another 2 h
• action="Heat": warmed up to rt
• action="Quench": quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)
• action="Concentrate": concentrated
• action="Add": The residue was added with water (10 mL)
• action="Extract": extracted with dichloromethane (12 mL × 3)
• action="Dry": dried over Na2SO4
• action="Filter": filtered
• action="Concentrate": concentrated
• action="Purify": The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33)
• action="Yield": to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid

Page 27: Not just another reaction database

Reaction 1: Example Reaction Step 1

Procedure Step
• Ordinal: 1; Title: Add; Experiment Stage: Executed
• Description: Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C
• Type: “Add”
• Parameters:
  • Substance: Stoichiometry Table Row for Diisobutylaluminium hydride
  • Substance: Stoichiometry Table Row for 9
  • Substance: Stoichiometry Table Row for dichloromethane
  • Temperature:
    • Value: −78 °C

<dl:reactionAction action="Add"><dl:phraseText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the

solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C</dl:phraseText> <dl:chemical ref="m1"/> <dl:chemical ref="m2"/><dl:chemical ref="m3"/><dl:parameter propertyType="Temperature" normalizedValue="-78">-78 °C.</dl:parameter>

</dl:reactionAction>

Underlined values are retrieved from elsewhere in the repository (so that if e.g. amounts are updated, changes can be made in one place and be picked up

Page 28: Not just another reaction database

Reaction 1: Example Reaction Step 2

<dl:reactionAction action="Stir">

<dl:phraseText>The reaction mixture was stirred at −78 °C for another 2 h</dl:phraseText> <dl:parameter propertyType="Time" normalizedValue="7200">2 h</dl:parameter>

<dl:parameter propertyType="Temperature" normalizedValue="-78">-78 °C</dl:parameter></dl:reactionAction>

Procedure Step
• Ordinal: 2; Title: Stir; Experiment Stage: Executed
• Description: The reaction mixture was stirred at −78 °C for another 2 h
• Type: “Stir”
• Parameters:
  • Temperature:
    • Value: −78 °C
  • Time: 2 hours
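The mapping from a text-mined `dl:reactionAction` element to a procedure step can be sketched with Python's standard library. The snippet below is a minimal illustration under stated assumptions: the XML string is a self-contained copy of the Stir action above with the `dl` namespace declared on the element itself, and `parse_action` is a hypothetical helper, not part of NextMove's or the repository's tooling. The normalised Time value of 7200 is consistent with seconds (2 h).

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the "dl" prefix in the CML output shown earlier.
DL = "http://bitbucket.org/dan2097"

XML_TEXT = (
    '<dl:reactionAction xmlns:dl="http://bitbucket.org/dan2097" action="Stir">'
    "<dl:phraseText>The reaction mixture was stirred at −78 °C "
    "for another 2 h</dl:phraseText>"
    '<dl:parameter propertyType="Time" normalizedValue="7200">2 h</dl:parameter>'
    '<dl:parameter propertyType="Temperature" normalizedValue="-78">'
    "-78 °C</dl:parameter>"
    "</dl:reactionAction>"
)


def parse_action(text: str) -> dict:
    """Turn one reactionAction element into a plain dict (illustrative)."""
    root = ET.fromstring(text)
    # Attributes carry no prefix, so they are looked up without a namespace.
    parameters = {
        p.get("propertyType"): float(p.get("normalizedValue"))
        for p in root.findall(f"{{{DL}}}parameter")
    }
    return {
        "type": root.get("action"),
        "description": root.findtext(f"{{{DL}}}phraseText"),
        "parameters": parameters,
    }


step = parse_action(XML_TEXT)
print(step["type"], step["parameters"])
```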

Page 29: Not just another reaction database

Reaction 1: Example Reaction Step 3

<dl:reactionAction action="Quench">

<dl:phraseText>quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)</dl:phraseText><chemical><molecule id="m4"> <name dictRef="nameDict:unknown">methanol</name></molecule>

<amount dl:propertyType="VOLUME" dl:normalizedValue="0.003">3 mL</amount> <identifier dictRef="cml:smiles" value="CO"/> <identifier dictRef="cml:inchi" value="InChI=1S/CH4O/c1-2/h2H,1H3"/> <dl:entityType>exact</dl:entityType>

</chemical><chemical><molecule id="m5"> <name dictRef="nameDict:unknown">citric acid</name></molecule> ….</chemical>

</dl:reactionAction>

Procedure Step
• Ordinal: 3; Title: Quench; Experiment Stage: Executed
• Description: quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)
• Type: “Quench”
• Parameters:
  • Substance:
    • Label: methanol
    • Compound:
    • Volume: 0.003 L
  • Substance:
    • Label: citric acid
    • Compound:
    • Volume: 0.005 L

Page 30: Not just another reaction database

Reaction 2: ELN-style reaction

Example reaction from Cornell (Will Dichtel’s research group, via Leah McEwen):

• Multiple “runs” of a reaction are performed, with different amounts, and under different conditions

• Results, observations and product characterisations are stored for each

• This allows the run which gives rise to the best yield to be identified

• Currently the experiment data are stored in a number of files (see below), but this information is suitable to be stored in an Electronic Lab Notebook:

– SJH-01-227_Enotebook.docx (“notebook” which shows the details of a particular run of a reaction – stoichiometry table (embedded Excel spreadsheet which does calculations), actual quantities, notes of conditions and results, and embedded TLC images)

– WeeklyReport_5_01_2014.docx (logs all runs of all reactions done during a particular week – grouped by reaction, with reaction schema and observations noted)

– spectra files

Page 31: Not just another reaction database

SJH-01-227_Enotebook.docx

Actual quantities
SJH-01-223: 0.1009 g
Benzaldehyde: 0.0554 g
Cu(OTf)2: 0.0055 g
TFA: 0.030 mL

EQ | FW | MMOL | g | d | mL | Reagent
1 | 756.95 | 0.132 | 0.100 | - | - | SJH-01-223
6 | 206.24 | 0.793 | 0.163 | - | - | Benzaldehyde
0.1 | 361.67 | 0.013 | 0.005 | - | - | Cu(OTf)2
3 | 114.02 | 0.396 | 0.045 | 1.49 | 0.030 | TFA
- | - | - | - | - | 1.321 | DCE (Conc in line 1 (M): 0.100)
1 | 1063.36 | 0.132 | 0.140 | - | - | Theoretical Yield

SJH-01-227 11/4/2014

SJH-01-223 and Cu(OTf)2 were transferred to a 5 mL RBF with a reflux condenser with a schlenk adaptor and put under a N2 environment. The benzaldehyde was dissolved in dichloroethane and this solution was added via syringe to the RBF reaction flask. The flask was then placed in a 100 °C oil bath and TFA was added via Hamilton microsyringe. The reaction stirred for 30 min.

When complete, the reaction was washed with sat. NaHCO3(aq) and extracted three times with DCM. The organic fractions were collected and dried with MgSO4, filtered and solvent was removed under vacuum. The product was purified on SiO2 column chromatography (3:7 DCM:Hexanes).

The product was isolated as a light yellow solid, 0.0963 g (68% Yield).

Reaction run -stoichiometry table

Procedure, parameters, substance parameters, equipment

Procedure-results

Reaction database

Procedure - results
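The embedded spreadsheet in the notebook derives each planned amount from the limiting reactant. The sketch below reproduces that arithmetic for the stoichiometry table above; it is a reconstruction under assumptions (the function name `stoichiometry_row` and the rounding to three decimals are illustrative), not the actual spreadsheet formulas.

```python
from typing import Optional


def stoichiometry_row(eq: float, fw: float, limiting_mmol: float,
                      density: Optional[float] = None) -> dict:
    """Planned mmol, grams and (for liquids) mL for one reagent row."""
    mmol = eq * limiting_mmol        # equivalents relative to limiting reactant
    grams = mmol * fw / 1000.0       # FW in g/mol, so mmol * FW is in mg
    ml = grams / density if density else None  # density in g/mL
    return {"mmol": mmol, "g": grams, "mL": ml}


# Limiting reactant SJH-01-223: 0.100 g planned, FW 756.95.
limiting_mmol = 0.100 / 756.95 * 1000.0

rows = {
    "Benzaldehyde": stoichiometry_row(6, 206.24, limiting_mmol),
    "Cu(OTf)2": stoichiometry_row(0.1, 361.67, limiting_mmol),
    "TFA": stoichiometry_row(3, 114.02, limiting_mmol, density=1.49),
    "Theoretical Yield": stoichiometry_row(1, 1063.36, limiting_mmol),
}

# Solvent volume for a 0.100 M solution of the limiting reactant:
# mmol divided by mol/L gives mL.
dce_ml = limiting_mmol / 0.100

for name, row in rows.items():
    print(name, round(row["mmol"], 3), round(row["g"], 3))
print("DCE mL", round(dce_ml, 3))
```

With these inputs the rounded values agree with the table (for example 0.793 mmol and 0.163 g of benzaldehyde, and 1.321 mL of DCE).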

Page 32: Not just another reaction database

WeeklyReport_5_01_2014.docx

Date NB Page Type Comments

12/7/2013 SJH-01-211 0.015g Did a prep plate purification and isolated ~0.002g from my top band. HNMR is tricky, not sure if I made it. Did not see anything on MALDI (graphite, no matrix, or 2,5 dihydroxy benzoic acid). GCMS shows a peak, retention time 13.01 min with m/z = 202. I don’t know what this mass equates to.

1/11/2014 SJH-01-227 0.100 Isolated 0.0963g. ASAP does not show significant surface area (20ish). Fluorescence does not change much but UV absorption does blue shift after benzannulation.

2/12/2014 SJH-01-227 2D high temp NMR is much more simplified than previous 2D NMRs. I haven’t yet gotten a chance to look through them and process the spectra. The 13C looks significantly simplified as well with 24 signals. Ivan has a partial assignment finished and we think we’ve figured out where the proton on the central benzene ring is. For a more complete assignment he said he or Tony would help me set up a band specific HMBC and HSQC to help solve some of the ambiguities in the NMR. He was using the low temp NMR to solve.

2/26/2014 Set up a band specific HMBC and HSQC for this with the help of Tony last night. The HMBC does not have good sensitivity for some reason. Tony is going to talk to Ivan about this and we should be able to get it next week.

3/26/2014 SJH-01-298 0.150 Was going to run this reaction last night but I opened the flask under vacuum instead of nitrogen and SM got sucked up into the hose. I extracted out the compound best I could. I’ll need to repurify but I should be able to do this reaction today.

3/26/2014 SJH-01-298 0.1417g Isolated 0.1235g of final product. 62%Y. Confirmed by MALDI and NMR. Working on Structural assignments with Ivan.

Experiment observations mostly – stored in Procedure results

Page 33: Not just another reaction database

• Files that would probably go into spectra bucket of data repository:

– SJH-01-227.jdx or SJH-01-227_jcamp.jdx (IR spectrum files - same content)

– SJH-01-227_22-145C.jdx (1H NMR spectrum)

– SJH-01-227-RT-2D.jdx (2D 1H NMR spectrum)

• Other files which might be processed (to extract e.g. store peak assignment values into the data repository so that they can be exported):

– SJH-01-227_DCM_rsw.rsw or SJH-01-227_DCM_rtf.rtf (UV-VIS-NIR peaks in text file – nearly the same as each other)

• Other files (we think duplicates of the above):

– SJH-01-227.spa (binary file)

– SJH-01-227_csv.csv (text, but with no headers)

– SJH-01-227_grams.spc (binary file)

– SJH-01-227_mattson.ras (binary file)

– SJH-01-227_nicolet.nic (binary file)

– SJH-01-227_pcir.ird (binary file)

– SJH-01-227_spa.spa (binary file)

– SJH-01-227_spectacle.irs (binary file)

– SJH-01-227_tiff.tiff and SJH-01-227_wmf.wmf (image files of the same spectrum)

– SJH-01-227_DCM_baseline.csw (UV-VIS-NIR, binary file)

– SJH-01-227_DCM_bsw.bsw (UV-VIS-NIR spectrum, binary file)

– SJH-01-227_DCM_csv.csv (might be able to do something with this – UV?)

– SJH-01-227_DCM_dsw.dsw (UV-VIS-NIR spectrum, binary file)

– SJH-01-227_DCM_grams.spc (UV-VIS-NIR spectrum, binary file)

– SJH-01-227_DCM_gsw.gsw (UV-VIS-NIR spectrum, binary file)

– SJH-01-227_DCM_msw.msw (UV-VIS-NIR spectrum, binary file)

Other spectra files Spectra database ultimately (but Procedure Results Files for now)

Procedure Results files

Use this as an interim example

Page 34: Not just another reaction database

ESI docx example – synthetic procedure

Synthesis of 17: 16 (0.101 g, 0.132 mmol) and Cu(OTf)2 (0.006 g, 0.01 mmol) were added to a round-bottom flask under a N2 atmosphere. In a separate vial, 2 (0.155 g, 0.753 mmol) was dissolved in C2H4Cl2 (1.3 mL) and transferred to the reaction flask. CF3CO2H (0.030 mL, 3 equiv) was added to the reaction mixture, which was refluxed at 100 °C for 1 h. The reaction mixture was washed with saturated NaHCO3 (15 mL) and extracted with C2H4Cl2 (3 x 5 mL). The organic fractions were collected, dried (MgSO4), and filtered to give a dark red solution. The solvent was removed, and the product was purified by column chromatography (SiO2, 30:70 CH2Cl2 : hexane) to yield 17 as a pale yellow powder (0.096 g, 68% yield). 17: 1H NMR (500 MHz, CDCl3): δ 8.15 (d, 2H), 8.13 (s, 1H), 7.98 (s, 1H), 7.95 (s, 2H), 7.92 (d, 2H), 7.88 (d, 1H), 7.87 (d, 1H), 7.84 (d, 1H), 7.80 (s, 1H), 7.69 (t, 2H), 7.64 (d, 2H), 7.57 (t, 2H), 7.56 (s, 2H), 7.54 (s, 2H), 7.54 (d, 2H), 7.45 (t, 1H), 7.44 (t, 2H), 7.40 (t, 2H), 7.39 (t, 1H), 7.38 (t, 1H), 7.34 (t, 1H), 6.88 (t, 4H), 6.88 (t, 2H), 6.80 (s, 2H), 6.77 (d, 4H), 6.70 (d, 1H), 6.50 (t, 1H), 6.39 (d, 2H), 6.24 (t, 2H), 6.22 (s, 1H), 6.11 (s, 2H), 6.04 (s, 1H). 13C NMR (125 MHz, CDCl3) δ 141.47, 141.10, 140.85, 140.42, 140.32, 140.20, 140.10, 139.60, 139.45, 139.37, 139.16, 139.03, 138.72, 138.28, 138.07, 133.28, 133.04, 132.96, 132.90, 132.64, 132.37, 131.60, 131.41, 131.19, 131.17, 130.72, 130.48, 130.28, 129.87, 129.85, 129.57, 129.30, 129.16, 129.11, 128.35, 128.21, 128.08, 128.04, 127.86, 127.72, 127.47, 126.85, 126.65, 126.50, 126.32, 126.25, 126.17, 126.08, 125.98, 125.84. IR (solid, ATR) 3051, 2925, 2131, 1947, 1590, 1488, 1444, 1415, 1318, 1274, 1180, 1133, 1074, 1018, 950, 882, 870, 809, 771, 743, 720, 697 cm-1. HRMS (DART) calcd for [C84H56+] 1064.4376, found 1064.4348.

Reaction runs database - stoichiometry table, reaction results and procedure – S88

Page 35: Not just another reaction database

Reaction 2: Reaction, ReactionFile and Reference

Reaction
• ReactionFile: SJH-01-227.cdx
• FileType: ReactionFileType.CDX
• ReactionSMILES:

C1(C#CC2=C(C3=CC=CC=C3)C=C(C=CC=C4)C4=C2)=CC(C#CC5=C(C6=CC=CC=C6)C=C(C=CC=C7)C7=C5)=CC(C#CC8=C(C9=CC=CC=C9)C=C(C=CC=C%10)C%10=C8)=C1.O=CC1=CC=CC=C1C#CC2=CC=CC=C2>[O-]S(=O)(C(F)(F)F)=O.[O-]S(=O)(C(F)(F)F)=O.[Cu+2].OC(C(F)(F)F)=O.ClCCCl>C%11(C%12=C(C=C(C=CC=C%13)C%13=C%12)C%14=C(C%15=CC=CC=C%15)C=C(C=CC=C%16)C%16=C%14)=CC(C%17=C(C=C(C=CC=C%18)C%18=C%17)C%19=C(C%20=CC=CC=C%20)C=C(C=CC=C%21)C%21=C%19)=CC(C%22=C(C=C(C=CC=C%23)C%23=C%22)C%24=C(C%25=CC=CC=C%25)C=C(C=CC=C%26)C%26=C%24)=C%11|f:3.4.5|

• Components (structure images not reproduced): Limiting Reactant compound; Reactant compound; Reactant compound; Solvent compound; Solvent compound; Product compound

Page 36: Not just another reaction database

Reaction 2: Reference

Reference
• ELN: Reaction SJH-01-227
• Authors: Sam Hein; William R. Dichtel; Leah McEwen
• URL: http://www.eln.com/cornell/dichtel/SJH-01-227
• Publication date: 12th February 2014
• PublicationType: PublicationType.ELN
• Reference Details: Reaction SJH-01-227

Page 37: Not just another reaction database

Reaction 2: Planned reaction run

Reaction
• Reaction Run: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Planned
• Stoichiometry Table:

Label | Reaction Component | Substance | Amounts | Comments
SJH-01-223 | Role: Limiting Reactant (Compound); Molecular Mass: 756.95 | State: Solid | Equivalence: 1; Moles: 0.132 mMol; Mass: 0.1 g | -
benzaldehyde | Role: Reactant (Compound); Molecular Mass: 206.24 | State: Solid | Equivalence: 6; Moles: 0.793 mMol; Mass: 0.163 g | -
Cu(OTf)2 | Role: Reactant (Compound); Molecular Mass: 361.67 | State: Liquid; Purity: 98%; Source: 283673-5G, Sigma Aldrich | Equivalence: 0.1; Moles: 0.013 mMol; Mass: 0.005 g | -
DCE | Role: Solvent (Compound) | State: Liquid; Purity: 99-100%; Source: 283673-5G, Sigma Aldrich | Volume: 1.321 mL | Concentration in line 1: 0.1 M
TFA | Role: Solvent (Compound); Molecular Mass: 114.02; Density: 1.49 g/ml | State: Liquid; Purity: 99%; Source: T6508-500mL, Sigma Aldrich | Equivalence: 3; Moles: 0.396 mMol; Mass: 0.045 g; Volume: 0.030 mL | -
SJH_01_227 | Role: Product (Compound); Molecular Mass: 1063.36 | State: Solid | Equivalence: 1; Moles: 0.132 mMol; Mass: 0.140 g | -

Page 38: Not just another reaction database

S88 process standard approach

[Figure: S88 process hierarchy and the corresponding experimental terms]
Process – Experiment
Process Stage – Synthesis stage
Process Operation – Preparation / Reaction / Work up / Isolation
Process Actions – Heat / Cool / Dose / Stir etc.

S88 allows procedure steps (process actions) to be grouped into “process operations”.

We allow “Procedure Steps” to be nested and have seeded the following procedure step types to assign to procedure steps for these parent operations:

S88 process operation / Procedure.StepTypes.Title
• Preparation
• Reaction
• Work up
• Isolation

Page 39: Not just another reaction database

Reaction 2: Planned procedure

Procedure
• Title: Reaction SJH-01-227 dated 2/12/2014; Failed Reaction: false; Experiment Stage: Planned; Link to ReactionRun
• Procedure Steps:

Ordinal | Parent | Title | Description | Parameter Substances | Parameter Equipment
1 | - | Reaction | - | - | -
2 | Reaction | Add | Add SJH-01-223 (0.1 g, 0.132 mmol) to a 5 mL round bottom flask with a reflux condenser with a schlenk adaptor | SJH-01-223 stoichiometry table row | round bottom flask (Volume=5mL, Type=Apparatus); reflux condenser; schlenk adaptor
3 | Reaction | Add | Add Cu(OTf)2 (0.005 g, 0.013 mmol) and put under a N2 environment | Cu(OTf)2 stoichiometry table row | N2 environment
4 | Reaction | Dissolve | Dissolve the benzaldehyde (0.163 g, 0.791 mmol) in DCE (1.3 mL) in a vial | Benzaldehyde stoichiometry table row; DCE stoichiometry table row | vial
5 | Reaction | Transfer | Transfer this solution via syringe to the reaction round bottom flask | - | syringe; round bottom flask

Page 40: Not just another reaction database

Reaction/Procedure Planned and Executed Experiment Stage

If there are differences between the planned and executed reaction or procedure, then both versions of the following can be stored, each flagged with an ExperimentStage field of Planned or Executed:

• Reaction run

• All corresponding stoichiometry table rows

• Procedure, and for each one:

• All corresponding Procedure Steps and ParameterValues and ParameterTimes

• Results and requested user inputs can be recorded and linked to the relevant procedure or step of the Executed Procedure

Page 41: Not just another reaction database

Reaction 2: Reaction run (Executed and Planned)

Reaction
• Reaction Run: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Planned
• Stoichiometry Table

Reaction
• Reaction Run: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Executed; Link to Planned reaction run
• Stoichiometry Table

By default, the executed version is shown, but the planned version can be accessed via clicking on a link

Links to actual amounts of reactants/reagents used

Links to planned amounts of reactants/reagents used

Page 42: Not just another reaction database

Reaction 2: Stoichiometry table (Executed and Planned)

Label | Reaction Component | Actual Amounts (Planned values) | State | Comments
SJH-01-223 | Role: Limiting Reactant (Compound); Molecular Mass: 756.95 | Mass: 0.1009 g; Moles: 0.133 mMol; Equivalence: 1 | Solid | -
benzaldehyde | Role: Reactant (Compound); Molecular Mass: 206.24 | Mass: 0.0554 g; Moles: 0.269 mMol; Equivalence: 2.02 | Solid | -
Cu(OTf)2 | Role: Reactant (Compound); Molecular Mass: 361.67 | Mass: 0.0055 g; Moles: 0.015 mMol; Equivalence: 0.11 | Solid | -
DCE | Role: Solvent (Compound) | Volume: 1.321 mL | Liquid | Concentration in line 1: 0.1 M
TFA | Role: Solvent (Compound); Molecular Mass: 114.02; Density: 1.49 g/ml | Volume: 0.030 mL; Mass: 0.045 g; Moles: 0.396 mMol; Equivalence: 2.97 | Liquid | -
SJH_01_227 | Role: Product (Compound); Molecular Mass: 1063.36 | Mass: 0.0963 g; Moles: 0.0906 mMol; Equivalence: 0.679 (planned: 1); Yield: 67.9% | Solid | -

Click to see added sample information (see next slide)

Label | Reaction Component | Planned Amounts | State | Comments
SJH-01-223 | Role: Limiting Reactant (Compound); Molecular Mass: 756.95 | Equivalence: 1; Moles: 0.132 mMol; Mass: 0.1 g | Solid | -
benzaldehyde | Role: Reactant (Compound); Molecular Mass: 206.24 | Equivalence: 6; Moles: 0.793 mMol; Mass: 0.163 g | Solid | -
Cu(OTf)2 | Role: Reactant (Compound); Molecular Mass: 361.67 | Equivalence: 0.1; Moles: 0.013 mMol; Mass: 0.005 g | Solid | -
DCE | Role: Solvent (Compound) | Volume: 1.321 mL | Liquid | Concentration in line 1: 0.1 M
TFA | Role: Solvent (Compound); Molecular Mass: 114.02; Density: 1.49 g/ml | Equivalence: 3; Moles: 0.396 mMol; Mass: 0.045 g; Volume: 0.030 mL | Liquid | -
SJH_01_227 | Role: Product (Compound); Molecular Mass: 1063.36 | Equivalence: 1; Moles: 0.132 mMol; Mass: 0.140 g | Solid | -

Page 43: Not just another reaction database

Reaction 2: Sample information of product (for executed version)

Sample
• Label: SJH_01_227
• OriginalDateAcquired: 17:00:00 02/12/2014
• SubstanceState = Solid
• SampleAmounts:
  • Mass: 0.0963 g at TimeStamp: 17:00:00 02/12/2014
• SubstanceSource:
  • Reaction:
    • Reaction Run: Reaction SJH-01-227 dated 2/12/2014
    • Stoichiometry Table Row Product: SJH_01_227
• Compound:
• Characterisations:
  • Appearance: “light yellow solid” at DateMeasured: TimeStamp: 17:00:00 02/12/2014

Page 44: Not just another reaction database

Reaction 2: Procedure (planned and executed values)

Procedure
• Title: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Planned; Link to ReactionRun
• Procedure Steps

Procedure
• Title: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Executed; Link to Planned Procedure; Link to ReactionRun
• Procedure Steps

Links to planned ReactionRun and Procedure Steps

Links to executed ReactionRun and Procedure Steps

Page 45: Not just another reaction database

Reaction 2: Procedure Steps (Executed version)

Ordinal | Parent | Title | Description | Parameter Substances | Parameter Equipment
1 | - | Reaction | - | - | -
2 | Reaction | Add | Add SJH-01-223 (0.1 g, 0.132 mmol) to a 5 mL round bottom flask with a reflux condenser with a schlenk adaptor | SJH-01-223 stoichiometry table row | round bottom flask (Volume=5mL, Type=Apparatus); reflux condenser; schlenk adaptor
3 | Reaction | Add | Add Cu(OTf)2 (0.005 g, 0.013 mmol) and put under a N2 environment | Cu(OTf)2 stoichiometry table row | N2 environment
4 | Reaction | Dissolve | Dissolve the benzaldehyde (0.163 g, 0.791 mmol) in DCE (1.3 mL) in a vial | Benzaldehyde stoichiometry table row; DCE stoichiometry table row | vial
5 | Reaction | Transfer | Transfer this solution via syringe to the reaction round bottom flask | - | syringe; round bottom flask

Ordinal | Parent | Title | Executed Description | Parameter Substances | Parameter Equipment
1 | - | Reaction | - | - | -
2 | Reaction | Add | Add SJH-01-223 (0.101 g, 0.133 mmol) to a 5 mL round bottom flask with a reflux condenser with a schlenk adaptor | SJH-01-223 stoichiometry table row | round bottom flask (Volume=5mL, Type=Apparatus); reflux condenser; schlenk adaptor
3 | Reaction | Add | Add Cu(OTf)2 (0.006 g, 0.015 mmol) and put under a N2 environment | Cu(OTf)2 stoichiometry table row | N2 environment
4 | Reaction | Dissolve | Dissolve the benzaldehyde (0.155 g, 0.790 mmol) in DCE (1.3 mL) in a vial | Benzaldehyde stoichiometry table row; DCE stoichiometry table row | vial
5 | Reaction | Transfer | Transfer this solution via syringe to the reaction round bottom flask | - | syringe; round bottom flask

All values that are retrieved from stoichiometry table rows are automatically updated with Executed rather than Planned values
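A simple way to get this behaviour is for a step description to hold a reference to a stoichiometry table row rather than a copy of its numbers, so the text is rendered from whichever row (Planned or Executed) is linked. The sketch below illustrates the idea only; `StoichRow` and `describe_add` are hypothetical names, and the values are the planned and executed amounts of SJH-01-223 quoted in the step descriptions.

```python
from dataclasses import dataclass


@dataclass
class StoichRow:
    """Hypothetical stoichiometry table row holding amounts for one label."""
    label: str
    mass_g: float
    mmol: float


def describe_add(row: StoichRow, vessel: str) -> str:
    """Render an 'Add' step description from the linked row's current values."""
    return f"Add {row.label} ({row.mass_g:g} g, {row.mmol:g} mmol) to {vessel}"


planned = StoichRow("SJH-01-223", 0.1, 0.132)
executed = StoichRow("SJH-01-223", 0.101, 0.133)
vessel = "a 5 mL round bottom flask"

print(describe_add(planned, vessel))
print(describe_add(executed, vessel))
```

Because the description is generated on demand, correcting an amount in the row changes every step that refers to it without editing the step text.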

Page 46: Not just another reaction database

• We have shown how this reactions database captures reactions:

• in sufficient detail for someone else to reproduce

• in analogous ways to those captured in an Electronic Lab Notebook

• with fully recorded processes, parameters and equipment in S88 process recipe [1] style

• raw characterization data linked to products

• which gave low yields or unintended products

• multistep reactions

• to fully record all reaction products (not just the target product)

Conclusions

Page 47: Not just another reaction database

Because of all this being captured and linked…

Reactions Substances Procedures Equipment

Compounds

Mixtures

Solutions

Samples

Reaction runs Steps

Parameters

Page 48: Not just another reaction database

Future work

• We have shown 2 examples:
  • Reaction 1: Example of reaction text-mined from RSC archive by NextMove with S88-style procedure
    • there are 31,000 more of these to be validated and imported
  • Reaction 2: Example from Will Dichtel’s research group via Leah McEwen (ELN-style reaction)
    • Consider pipeline for population direct from ELNs

• Develop reactions user interface, API, and import/validation platform

Page 49: Not just another reaction database

Thank you

Email: [email protected]

Slides:

http://www.slideshare.net/valerytkachenko16