Development of a Ligand Knowledge Base
Natalie Fey
Crystal Grid Workshop
Southampton, 17th September 2004
Overview
Ligand Knowledge Base
Synergy of Database Mining and Computational Chemistry:
  Part 1: How computational chemistry can add value to database mining results.
  Part 2: How database mining can inform a ligand knowledge base of calculated descriptors.
Ligand Knowledge Base
Aims:
Collect information about ligands and their transition metal (TM) complexes:
  Database mining.
  Computational chemistry.
Exploit networked computing and data storage resources – e-Science.
Use data:
  Interpretation of observations.
  Predictions for new ligands.
Ligand Knowledge Base
Mine Structural Databases (e.g. CSD):
  Compile systematic structural information about TM complexes.
Computational Chemistry (e.g. DFT):
  Calculate structural and electronic parameters for known and unknown TM complexes.
[Diagram: both strands feed the Ligand Knowledge Base.]
Part 1: “Unusual” Geometries
Automatic workflow:
  1. Query CSD for structural pattern.
  2. Statistical analysis of results: Main Geometry / Trends.
  3. Apply outlier criteria: Outliers.
  4. DFT geometry optimisation: Optimised Geometries.
  5. Compare with crystal structures: Crystal Structure and DFT agree, or Crystal Structure and DFT disagree.
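The "apply outlier criteria" step could be sketched as follows. The slides do not state which criterion was used; this sketch assumes a robust modified z-score based on the median absolute deviation (MAD). The function name, the threshold of 3.5 and the flat list of parameter values are hypothetical.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold.

    Assumption: a median/MAD criterion stands in for the (unstated)
    outlier criteria applied to the CSD query results.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half the values equal the median: flag anything that differs.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    # 0.6745 rescales the MAD so the score is comparable to a normal z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]
```

The flagged indices would then be the structures passed on to DFT geometry optimisation. A median-based score is used here because a mean and standard deviation are themselves distorted by the outliers being sought.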
Part 1: “Unusual” Geometries
Crystal Structure and DFT agree
Flow chart:
  Structure Report
  Comment about structure? (Yes / No)
  Why outlier?
  Note in database, may confirm by DFT
  Further calculations
  Additional results, add to database
  Flag for detailed investigation
  Value Added
Part 1: “Unusual” Geometries
Crystal Structure and DFT disagree
Flow chart:
  Structure Report
  Why? Problem with Calculation or Problem with Structure
  Revised Calculations: Crystal Structure and DFT agree, or Crystal Structure and DFT disagree
  Comment about structure? (Yes / No)
  Note in database
  Flag for detailed investigation
  Further calculations
  Additional results, add to database
  Value Added
Example – 4-coordinate Ruthenium
Main geometry: tetrahedral (14 structures).
2 square-planar cases: YIMLEL, QOZMEX.
YIMLEL: cis-[RuCl2(2,6-(CH3)2C6H3NC)2]
[Structure diagram of YIMLEL; atom labels Ru, Cl, Cl, N, N.]
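One way to separate tetrahedral from square-planar 4-coordinate centres automatically is a geometry index computed from the ligand-metal-ligand angles. The talk does not say how the geometries were classified. The sketch below swaps in the τ4 index, τ4 = (360° − (α + β)) / 141°, where α and β are the two largest angles at the metal, giving 1 for an ideal tetrahedron and 0 for an ideal square plane. The function names and the 0.5 cut-off are hypothetical.

```python
def tau4(angles_deg):
    """tau4 geometry index from the six L-M-L angles (degrees) at a 4-coordinate centre."""
    if len(angles_deg) != 6:
        raise ValueError("a 4-coordinate centre has six L-M-L angles")
    # alpha and beta are the two largest angles at the metal.
    alpha, beta = sorted(angles_deg, reverse=True)[:2]
    return (360.0 - (alpha + beta)) / 141.0


def classify_four_coordinate(angles_deg, cutoff=0.5):
    """Label a centre as tetrahedral or square-planar.

    Assumption: the 0.5 cut-off is illustrative, not taken from the talk.
    """
    return "tetrahedral" if tau4(angles_deg) >= cutoff else "square-planar"
```

Applied to every hit of a 4-coordinate query, a classifier like this would reproduce the split into a main geometry and a small set of unusual cases, which can then be checked individually.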
4-coordinate Ruthenium
DFT result: Use as CSD query, any TM…
SIVGAV – Pd
Supported by structural arguments:
  Short Ru(II)-Cl and Ru-CNR distances.
  Correct range and geometry for Pd.
Run DFT with Pd:
Part 2: P-donor LKB
Range of DFT-calculated descriptors for monodentate P(III) ligands and TM complexes.
  Capture steric and σ/π-electronic properties.
Identification of suitable statistical analysis approaches:
  Interpretation.
  Prediction.
Part 2: P-donor LKB
Role of database mining:
Stage 1: Database generation.
  Inform input geometries (conformational freedom).
  Verification of chosen theoretical approach.
Stage 2: Database utilisation.
  Supply experimental data for regression models.
  Confirmation of calculated trends.
Examples Stage 1
Conformers: e.g. P(o-tolyl)3
[Structure diagrams.]
Method verification:
[Plot: av. P-R, calculated (axis 1.86 to 1.96) against av. P-R, CSD (axis 1.83 to 1.89); points labelled Me3, Et3, Pr3, iPr3, Bu3, tBu3, Cy3.]
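Method verification of this kind amounts to comparing calculated averages with database averages ligand by ligand. A minimal sketch, assuming paired lists of calculated and CSD average P-R distances: the function name and the choice of summary statistics (mean signed deviation and Pearson correlation) are assumptions, not taken from the talk.

```python
from math import sqrt


def compare_calc_to_csd(calculated, csd):
    """Return (mean signed deviation, Pearson r) for paired distance lists."""
    if len(calculated) != len(csd) or len(csd) < 2:
        raise ValueError("need two equal-length lists with at least two entries")
    n = len(csd)
    mean_calc = sum(calculated) / n
    mean_csd = sum(csd) / n
    # Positive deviation: calculated distances are longer than the CSD averages.
    mean_dev = mean_calc - mean_csd
    cov = sum((c - mean_calc) * (x - mean_csd) for c, x in zip(calculated, csd))
    var_calc = sum((c - mean_calc) ** 2 for c in calculated)
    var_csd = sum((x - mean_csd) ** 2 for x in csd)
    if var_calc == 0 or var_csd == 0:
        raise ValueError("correlation is undefined for constant input")
    return mean_dev, cov / sqrt(var_calc * var_csd)
```

A roughly constant offset combined with a correlation close to 1 would indicate that the chosen theoretical approach reproduces the experimental trend even where absolute values are shifted.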
Examples Stage 2:
Solid State Rh-P Distance (Rh(I), CN=4)
[Plot: experimental and predicted Rh-P distance, both axes 2.275 to 2.425.]
[Plot: Residual (-0.003 to 0.003) against Predicted Value (2.28 to 2.44).]
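The regression step behind a predicted-versus-experimental comparison can be sketched with ordinary least squares. This is a simplification that assumes a single calculated descriptor and a straight-line model; the talk does not specify the model form or the descriptors used, and the function names and the sign convention for the residual are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two entries")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_with_residuals(x, y, slope, intercept):
    """Return (predicted, residuals); residual = experimental - predicted (assumed convention)."""
    predicted = [slope * xi + intercept for xi in x]
    return predicted, [yi - pi for yi, pi in zip(y, predicted)]
```

Here `x` would hold a calculated descriptor for each ligand and `y` the corresponding experimental solid-state distance. Plotting the residuals against the predicted values is a quick check for systematic deviations from the model.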
Conclusions
Synergy of approaches adds value to structural databases.
  Computational chemistry can be used to verify solid state geometries.
  Can exploit e-Science resources to add value on a large scale.
Utility of large databases for structural chemistry of transition metal complexes.
  Computational requirements.
  Statistical analysis.
Acknowledgements
Guy Orpen, Jeremy Harvey
Athanassios Tsipis, Stephanie Harris
Ralph Mansson (Southampton)
Funding: