David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

David Evans and George Papadatos, Lilly Research Centre, Erl Wood Manor, Windlesham, UK, 22nd September 2011

Transcript of David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Page 1: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

David Evans and George Papadatos

Lilly Research Centre, Erl Wood Manor, Windlesham, UK

22nd September 2011

Page 2: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• Discover new chemotypes

• Multiobjective space

• Isosteres in activity

• Improvements in properties

• Want to use multiple tools in same environment

• But understand what works when

Page 3: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• Open Source Workflow tool – main client is free

• But support is available and can integrate commercial vendors + in-house code as nodes

• Have released many Erl Wood nodes to KNIME community site

• http://tech.knime.org/community/erlwood

Page 4: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Workflow nodes shown: FieldAlign, Xedmin, Xedex

Xedmin

• XED minimization

• 2D -> 3D

Xedex

• Conformational analysis

FieldView

• Launches FieldView

• View field points + energies + other data

All nodes pass SDF

Page 5: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

FieldAlign

• Flexible alignment of query molecules onto template

Page 6: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'


Process is more than just the database search

Why?

Page 7: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Don’t want to load all databases onto all users’ PCs

Command-line search

SOAP Web Service

• Apache Tomcat

node

Platform-independent communication + secure intranet!

Page 8: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• Read in pre-built hypothesis (MOE, Phase)

• Or sketch from template molecule

• Jmol-based visualizer

• Can also annotate and filter hits, which aids manual inspection

Non-proprietary structure

Page 9: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Maximum Unbiased Validation (MUV) dataset

• 17 targets, each with 30 ligands and 15,000 decoys; source: PubChem bioactivity data.

• Wide-ranging targets: hormone receptors, kinases, proteases, GPCRs plus others (e.g. HSP90, HIV RT).

• Unbiased for chemical analogues, as MUV ligands are pre-clustered with a 2D fingerprint

• 1.16 compounds per scaffold class

MUV: J. Chem. Inf. Model., 2009, 49 (2), 169-184.

How well do automated pharmacophore methods do compared to 2D methods?

Page 10: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'
Page 11: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• Have looked at whole molecule similarity

• Is there more data if we find fragments which maintain activity?

• Matched molecular pair analysis (MMP)

• Fragments compounds and finds pairs where only one fragment differs

Page 12: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

The mining and statistical analysis of transformations and their impact on properties of interest (e.g. solubility or activity)

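Table columns: left molecule | right molecule | transformation | ΔSolubility (mg/ml)

Substituents shown in the example transformations: H, F, Br, OCH3

ΔSolubility values shown: -0.8, +1.2, +2.4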

Page 13: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

It used to be a slow and computationally expensive process...

• Pair-wise maximum common substructure extraction – O(N²)

Recently a much more efficient algorithm was published

1) Cleave all acyclic single bonds, one by one

2) Index all the fragments (cf. book index)

3) Enumerate the values for each key: Mol A >> Mol B

(*in an automated and unsupervised way)

Hussain and Rea (2010). J. Chem. Inf. Model., 50 (3), 339-348.

Wagener and Lommerse (2006). J. Chem. Inf. Model., 46 (2), 677-685.
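The index-and-enumerate idea can be sketched in a few lines of plain Python. This is only an illustration, not the Erl Wood node or the published implementation: the fragmentation step is assumed to have been done already, and the molecule IDs, the SMILES-like context and fragment strings, and the function names `index_fragments` and `enumerate_pairs` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical pre-fragmented input: (molecule ID, context, variable fragment).
# In a real workflow these tuples would come from cleaving each acyclic
# single bond with a cheminformatics toolkit; here they are written by hand.
SINGLE_CUTS = [
    ("mol1", "[*]c1ccccc1", "[*]F"),
    ("mol2", "[*]c1ccccc1", "[*]Cl"),
    ("mol3", "[*]c1ccccc1", "[*]OC"),
    ("mol4", "[*]C1CCCCC1", "[*]F"),
]


def index_fragments(cuts):
    """Step 2: build the index, keyed by context (like a book index)."""
    index = defaultdict(list)
    for mol_id, context, fragment in cuts:
        index[context].append((mol_id, fragment))
    return index


def enumerate_pairs(index):
    """Step 3: any two entries under one key form a matched pair."""
    pairs = []
    for context, entries in index.items():
        for (id_a, frag_a), (id_b, frag_b) in combinations(entries, 2):
            if frag_a != frag_b:
                pairs.append((id_a, id_b, f"{frag_a}>>{frag_b}", context))
    return pairs


pairs = enumerate_pairs(index_fragments(SINGLE_CUTS))
for pair in pairs:
    print(pair)
```

Because molecules are only compared within one index key, no all-against-all substructure comparison is needed; a context that occurs in a single molecule (mol4 here) yields no pairs.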

Page 14: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

In: MolRegnos (IDs), structures (in RDKit format) and property values

Out: Matched pairs (left and right molecule, IDs, transformation, property values, ΔP, context, transformation atom count)

Available as an Erl Wood community contribution node

Page 15: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Find isosteres in chEMBL

chEMBL – Database of published medicinal chemistry activity data

– Using chEMBL_10 , total >1,000,000 compounds

Use here just human protein kinase inhibitors

Quality assurance for chEMBL data (SQL statement)

• Med. chem. friendly compounds, parent structure, not downgraded, confidence score = 9, exact IC50 or Ki values only (converted to pIC50/pKi)

• ~14K data points

• Compare biological values coming from the same assay ID only

Aggregate transformations; calculate ΔpIC50 values and assign them to 3 bins

• Good – Bad – Neutral (depending on a cut-off c = 0.4 log units)
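A minimal sketch of the conversion and binning, under stated assumptions: the slide gives only the cut-off c = 0.4 log units, so the sign convention (Good = activity rises by more than c going from left to right molecule, Bad = it falls by more than c) and the function names `to_pic50` and `bin_delta` are assumptions made for illustration.

```python
import math


def to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


def bin_delta(delta, cutoff=0.4):
    """Assign a delta-pIC50 (right minus left) to one of three bins."""
    if delta > cutoff:
        return "Good"
    if delta < -cutoff:
        return "Bad"
    return "Neutral"


left = to_pic50(100.0)   # left molecule, IC50 = 100 nM
right = to_pic50(10.0)   # right molecule, IC50 = 10 nM
delta = right - left
print(round(delta, 2), bin_delta(delta))
```

Values exactly at the cut-off fall into the Neutral bin in this sketch; the source does not say how ties were handled.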

Page 16: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• Each transformation has a neutral count

• Absolute value or percentage: NeutralCount%
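One way to compute the neutral count and NeutralCount% per transformation is sketched below. The records are invented for illustration, and the function name `neutral_counts` is hypothetical; only the idea of counting Neutral pairs per transformation, as an absolute value or a percentage, comes from the slide.

```python
from collections import Counter, defaultdict

# Hypothetical (transformation, bin) records, one per matched pair.
RECORDS = [
    ("[*]F>>[*]Cl", "Neutral"),
    ("[*]F>>[*]Cl", "Neutral"),
    ("[*]F>>[*]Cl", "Good"),
    ("[*]F>>[*]Cl", "Bad"),
    ("[*]C>>[*]OC", "Neutral"),
]


def neutral_counts(records):
    """Return {transformation: (neutral count, neutral count %)}."""
    bins = defaultdict(Counter)
    for transformation, label in records:
        bins[transformation][label] += 1
    summary = {}
    for transformation, counts in bins.items():
        total = sum(counts.values())
        neutral = counts["Neutral"]
        summary[transformation] = (neutral, 100.0 * neutral / total)
    return summary


summary = neutral_counts(RECORDS)
for transformation, (count, pct) in summary.items():
    print(transformation, count, f"{pct:.0f}%")
```

A percentage based on a single pair (the second transformation here) is not very informative, which is one reason to keep the absolute count alongside it.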

Page 17: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

chEMBL workflow outputs isosteric fragments

How similar are isosteres in 2D fingerprint space?

In field space?

Could fields help us find unexpected isosteres?

Page 18: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• 1802 fragment pairs from chEMBL_10 kinase data set

• 481 with no rotatable bonds left or right

• Simplifies conformational analysis

• For each fragment pair

1. Swap attachment points for adamantyl

2. FieldAlign to get field similarity (use adamantyl to constrain overlay)

3. RDKit fingerprint similarity – topological, Daylight-esque

4. Correct similarities for adamantyl

• Are there isosteric pairs with high field similarity but low RDKit similarity?
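The slide does not say how the adamantyl correction was made, so the following is only an illustration of why one is needed. It computes a Tanimoto coefficient on sets of feature IDs and shows how features contributed by a shared cap inflate the score. The feature IDs, the `CAP` set, and the idea of simply discarding cap features are all assumptions; real path-based fingerprints also contain features that span the cap and the fragment, which this sketch ignores.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical feature IDs; "cap*" stands for features of the shared adamantyl.
CAP = {"cap1", "cap2", "cap3", "cap4"}
frag_left = {"f1", "f2", "f3"} | CAP
frag_right = {"f2", "f3", "f4", "f5"} | CAP

raw = tanimoto(frag_left, frag_right)
corrected = tanimoto(frag_left - CAP, frag_right - CAP)
print(f"raw={raw:.2f} corrected={corrected:.2f}")
```

With the cap included, every pair shares the same block of features, so all raw similarities are pushed upwards and small fragments are affected most.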

Page 19: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim

Size by Neutral Count %: larger = more isosteric

• Pairs with high field similarity but low 2D similarity

• Pairs with high field and 2D similarity

Page 20: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim, size by Neutral Count %

Only those with >60% isosteric examples

Thiophene -> Phenol

Page 21: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim, size by Neutral Count %

Only those with >60% isosteric examples

Imidazole -> Morpholine?

Page 22: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim, size by Neutral Count %

Only those with >60% isosteric examples

Some small fragments

Page 23: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

WEE1 kinase, PDB 2I06

Non-proprietary structure (from PDB)

Labels: Solvent-exposed, Buried

Page 24: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim, size by Neutral Count %

Only those with >60% isosteric examples

Me-tetrazole -> oxadiazole

Page 25: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim, size by Neutral Count %

Only those with >60% isosteric examples

Thiophene -> phenol

Page 26: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Non-proprietary structure (from PDB)

Page 27: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• 6299 data points from thermodynamic solubility assay

• 423 single-point transformations

• 215 transformations with no rotatable bonds

• Aggregate transformations; calculate and bin ΔlogS in 3 bins

• Good – Bad – Neutral (c = 0.3 log units)

• Are there transformations which increase solubility with low field similarity but high RDKit similarity?
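The closing question amounts to a filter over the aggregated transformations. A minimal sketch follows, with invented data: the similarity thresholds `max_field` and `min_rdkit` are hypothetical, the 60% default for the Good bin mirrors the ">60% boosting examples" filter on the plots that follow, and the `Transformation` record and `solubility_boosters` function are names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Transformation:
    name: str
    field_sim: float
    rdkit_sim: float
    good_pct: float  # percentage of pairs in the Good bin


def solubility_boosters(rows, max_field=0.6, min_rdkit=0.7, min_good_pct=60.0):
    """Keep transformations with low field similarity, high 2D similarity,
    and a Good percentage above the threshold."""
    return [
        r for r in rows
        if r.field_sim < max_field
        and r.rdkit_sim > min_rdkit
        and r.good_pct > min_good_pct
    ]


# Hypothetical aggregated transformations.
ROWS = [
    Transformation("T1", 0.45, 0.80, 75.0),
    Transformation("T2", 0.85, 0.82, 90.0),
    Transformation("T3", 0.40, 0.30, 65.0),
    Transformation("T4", 0.50, 0.75, 20.0),
]
hits = solubility_boosters(ROWS)
print([r.name for r in hits])
```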

Page 28: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim, size by Good Count %

Only those with >60% boosting examples

Ring contraction + twist?

Page 29: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Plot of Field Sim against RDKit Sim, size by Good Count %

Only those with >60% boosting examples

Big boost from morpholine

Page 30: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

• Can mine chEMBL data for non-obvious isosteres

• Will other data sets find more?

• Would like to improve workflow to make isostere data set for 3D similarity comparison

• Improve fragmentation/conformer/alignment handling?

• Need to include whole molecule?

• Need 3D binding site data as well to confirm isosterism?

• KNIME platform developing

• Virtual screening and evaluation environment

• Rapid experimentation with varied tools

• http://tech.knime.org/community/erlwood

Page 31: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

George Papadatos

Juliette Pradon

Hina Patel

Nikolas Fechner

David Thorner

Michael Bodkin

KNIME, chEMBL + Cresset !

Page 32: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

ROC curves for retrieval of >66% isosteric groups

Field similarity performs better than RDKit

But AUC = 0.68

Workflow not optimized for this purpose
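For reference, an AUC such as the 0.68 quoted above can be computed directly from similarity scores and isosteric/non-isosteric labels. The sketch below uses the rank interpretation of AUC (the probability that a randomly chosen positive scores higher than a randomly chosen negative). The scores and labels are invented, and the function name `roc_auc` is hypothetical; it is not the evaluation code behind the slide.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly
    (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical similarity scores; label 1 marks an isosteric pair.
field_sim = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
is_isosteric = [1, 1, 0, 1, 0, 0]
auc = roc_auc(field_sim, is_isosteric)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts a value of 0.68 in context.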