Andy Pope
Platform Technology & Science, GlaxoSmithKline,
Collegeville PA, USA
SLAS 2012, San Diego February 4-8, 2012
Screening Heuristics & Chemical Property Bias - New directions for Lead Identification and Optimization
Screening Heuristics
Why Screening Heuristics?
1. Huge, complex datasets → screening wisdom? (customers)
2. Refining approaches/deliverables → success rates, attrition
Some available datasets inside GSK (GSK compounds + data >>10^6):
- HTS: >300
- Focused screens (FS): >200
- ELT: >150
- Target class profiling: >300
- Program profiling: >500
- FBDD: >20
- Marketed drugs etc.: >10^3
- Phys-chem / DMPK: >10^5
- Safety profiling: >50
[Diagram: hit ID compound profiles and structures + properties, each layered with descriptor metadata; public data (e.g. PubChem; literature, Connectivity Maps); other GSK data, e.g. genomic, bio-informatic, clinical]
300+ HTS Campaigns – 2004-11
[Chart: 2007-11 screens by assay technology (15 classes) and target class (13 classes), sized by count of screens]
Twin approaches to screening heuristics
1. Building Collective wisdom
- Capture, combine and share the experiences of screeners and the data from their screens
2. New “big” data analysis/ insights
- Look for data patterns in large aggregated datasets
e.g. (1): How well do different assay methods perform? What is the impact of screen quality, and what should be targeted in assay development? What policies do I need in place to have a high-quality screening process? Which assay technology works best?
e.g. (2): Do chemical properties influence the results of screens? How are screen results related between targets and assay methods? Which is the best method to use to discover hits? How are library properties reflected in the hits?
From SBS Virtual Seminar Series 2007 - HTS Module 1
Building Collective Wisdom – a simple example
Some questions:
- What actually happens in practice as Z' varies?
- What Z' should we be aiming for?
- Is this affected by the type of assay?
- What is the appropriate trade-off between cost, robustness and sensitivity?
- How are we doing?
[Charts: production failure rate (% of plates), cycle time (weeks/campaign), and statistical cut-off (% effect) vs. average Z' of the assay in HTS production, binned from 0.4-0.5 up to >0.8]
Z’ Heuristics
- Z' > 0.8 is ideal, > 0.7 acceptable
- Below Z' = 0.7, many aspects of performance degrade (e.g. failures, cycle times, false +ve/-ve, hit confirmation)
- Z' vs. "sensitivity" trade-off arguments may be based on false hunches
- Target & assay type do not make a major difference
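The Z' heuristics above rely on the standard Z' factor of Zhang, Chung and Oldenburg (1999), which the slides assume rather than define. A minimal sketch of its calculation from plate control wells:

```python
import statistics

def z_prime(pos_controls, neg_controls):
    """Z' factor (Zhang et al., 1999):
    Z' = 1 - 3 * (SD_pos + SD_neg) / |mean_pos - mean_neg|.
    Per the heuristics above, > 0.8 is ideal and > 0.7 acceptable."""
    mu_p, sd_p = statistics.mean(pos_controls), statistics.stdev(pos_controls)
    mu_n, sd_n = statistics.mean(neg_controls), statistics.stdev(neg_controls)
    return 1 - 3 * (sd_p + sd_n) / abs(mu_p - mu_n)
```

Tight, well-separated controls (small SDs relative to the window between the control means) give Z' close to 1; overlapping controls drive it toward zero or below.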
Properties, properties, properties…..
….But, do they affect screening data? ….are we selecting hits with the best properties?
….Bottom line: high cLogP (greasiness) is BAD. This needs to be fixed at the start, i.e. in hit ID, and it tends to creep up during Lead Op.
Do compound molecular properties impact how they behave in screens?
[Chart: hit rate (%) for compounds in each polar surface area (tPSA, Å²) bin; aggregate results from all 330 campaigns 2005-2010 with >500K tests]
Worked example: compounds with tPSA 80-85 Å² – 26M measured responses in this bin, 485k marked as "hit"; hit rate = 100 × (485k / 26M) = 1.86%.
Total polar surface area (tPSA) is defined as the surface sum over all polar atoms; tPSA < 60 Å² predicts brain penetration, while > 140 Å² predicts poor cell penetration.
"Hit" = % effect ≥ 3 × RSD of the sample population in that specific screen.
e.g. Compound total polar surface area (tPSA) makes no difference.
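The slide's hit definition (% effect at least 3 RSD above the sample population of that screen) and the per-bin hit-rate arithmetic can be sketched as follows; the data layout is a hypothetical simplification for illustration, not GSK's actual pipeline:

```python
import statistics
from collections import defaultdict

def mark_hits(effects, n_sd=3):
    """Mark samples as hits when % effect exceeds mean + n_sd * SD of the
    sample population in that screen (the statistical cut-off above)."""
    cut = statistics.mean(effects) + n_sd * statistics.stdev(effects)
    return [e >= cut for e in effects]

def hit_rate_by_bin(records, bin_width=5.0):
    """records: (property_value, is_hit) pairs, e.g. (tPSA, hit flag).
    Returns {bin_start: hit rate in %}, mirroring the slide's
    hit rate = 100 * hits / responses per property bin."""
    counts = defaultdict(lambda: [0, 0])   # bin -> [hits, total]
    for value, is_hit in records:
        b = bin_width * int(value // bin_width)
        counts[b][0] += is_hit
        counts[b][1] += 1
    return {b: 100.0 * h / n for b, (h, n) in counts.items()}
```

Scaling the slide's worked example down by 1000 (485 hits out of 26,000 responses in the 80-85 Å² bin) reproduces the quoted ~1.86% hit rate.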
Size Matters……
[Charts: % compounds per MW bin and cumulative % compounds vs. MW; hit rate (%) vs. molecular weight (MW). Middle 80% of compounds: MW 270-470. Only bins containing 1M or more records are shown.]
Overall hit rate rises 1.7-fold (1.50% to 2.62%) across the middle 80% of the screening deck, i.e. a 70% rise in hit rate from MW = 270 to MW = 470, and 3.3-fold (1.2% to 4.0%) across the full MW range.
Greasiness matters most……
[Charts: % compounds per ClogP bin and cumulative % compounds vs. ClogP; hit rate (%) vs. ClogP. Middle 80% of compounds: ClogP 1-5. Only bins containing 1M or more records are shown.]
Overall hit rate rises 2.9-fold (1.14% to 3.31%) across the middle 80% of the screening deck, i.e. from ClogP = 1 to ClogP = 5, and 4.1-fold (1.1% to 4.5%) across the full ClogP range.
HTS Promiscuity – cLogP
[Chart: inhibition frequency index* (%) vs. cLogP, with compounds hitting ~1 target at one end and compounds hitting >10% of targets at the other]
Note: compounds were required to have been run in ≥50 HTS, and to have yielded >50% effect in at least a single screen, to be included.
*Inhibition frequency index (IFI) = % of screens where the compound yielded >50% inhibition, where total screens run ≥ 50.
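The IFI footnote translates directly into code. The function below is an illustrative sketch, taking the results for one compound as a list with one % inhibition value per screen run:

```python
def inhibition_frequency_index(results, min_screens=50, threshold=50.0):
    """IFI per the footnote above: % of screens in which the compound
    yielded > threshold % inhibition, computed only for compounds run
    in at least min_screens screens."""
    if len(results) < min_screens:
        return None  # too few screens for a meaningful frequency
    active = sum(1 for r in results if r > threshold)
    return 100.0 * active / len(results)
```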
"Dark" Matter is small and polar
[Chart: molecular weight (Da) vs. cLogP for "dark matter" compounds]
– Compounds which have not yielded >50% effect even once in >50 screens
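The "dark matter" definition quoted above can be expressed as a simple predicate over a hypothetical list of per-screen % effect values for one compound:

```python
def is_dark_matter(results, min_screens=50, threshold=50.0):
    """'Dark matter' per the slide: a compound that has not yielded
    > threshold % effect even once, across more than min_screens screens."""
    return len(results) > min_screens and all(r <= threshold for r in results)
```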
Biases translate to full-curve follow-up and beyond
[Charts: % compounds tested vs. cLogP and vs. molecular weight, comparing single-shot (SS) testing, full-curve (FC) testing, and the FC – SS differential]
- Elevated testing of large, lipophilic compounds in the full-curve phase of HTS
- Reduced testing of small, polar compounds in the full-curve phase of HTS
Note: plots represent data from 402M single-concentration responses & 2.1M full-curve results.
Property biases in primary HTS hit marking are propagated forward to dose-response follow-up.
Property bias detection at an individual screen level
[Chart: hit rate as % of hit rate at cLogP = 3.5, vs. cLogP, for the screens with the largest response to cLogP]
Assay Technology vs. property bias
[Chart: hit rate as % of hit rate at cLogP = 3.5, vs. cLogP, by assay technology, normalized to the hit rate for that screen at the median collection cLogP value; colored by hit rate (%)]
e.g. No clear origins in any meta-data (assay technology, target class, screen quality etc.), but the effects are detectable even at the single-screen level.
[Charts: hit rate (%) vs. ClogP, rising from 1.28% to 3.80%; hit rate (%) vs. MW, from 2.14% to 2.27% – pretty flat]
Lipophilicity trends in PubChem HTS Data
Primary data from around 100 academic HTS campaigns obtained from PubChem BioAssay.
Lipophilicity – similar to GSK HTS; compound size – little effect.
GSK screening deck (>50 HTSs, 2.01M cpds) ClogP = 0.00835*MW – 0.058, R2 = 0.18
PubChem Compounds (405k) ClogP = 0.00554*MW + 0.97, R2 = 0.09
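The two quoted fits are least-squares regressions of ClogP on MW with their R² values. For reference, a minimal self-contained fit in the same form (slope, intercept, R²); this is a generic reimplementation, not the software used for the slide:

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a*x + b plus R^2, the form of the
    ClogP-vs-MW relationships quoted above."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx                      # slope
    b = my - a * mx                    # intercept
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot           # coefficient of determination
    return a, b, r2
```

The low R² values on the slide (0.18 and 0.09) mean MW explains little of the variance in ClogP, even though the trend is positive in both decks.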
[Chart: hit rate (% of compounds giving >50% inhibition at 10 uM) vs. ClogP]
Not just HTS… Lipophilicity trends in kinase focused set screens
Primary data from ~50 focused screen campaigns against protein kinases.
Lipophilicity and size – similar to GSK HTS.
[Charts: hit rate (% of compounds giving >50% inhibition at 10 uM) vs. MW and vs. ClogP]
Property        R², ± vs MW    R², ± vs ClogP
MW              1.00, +        0.21, +
ClogP           0.21, +        1.00, +
HAC             0.92, +        0.19, +
fCsp3           0.15, +        0.00
RotBonds        0.36, +        0.04, +
tPSA            0.16, +        0.08, -
Chiral          0.02, +        0.00
HetAtmRatio     0.02, -        0.34, -
Complexity      0.31, +        0.02, +
Flexibility     0.02, +        0.00
AromRings       0.22, +        0.16, +
HBA             0.11, +        0.10, -
HBD             0.01, +        0.02, -
Bias from other simple chemical properties?
[Chart: hit rate (%) vs. fraction of carbons that are sp3 (fCsp3)]
- Positive bias: cLogP, MW (HAC)
- Negative bias: fCsp3, flexibility
Improving hit marking – Property Biasing
[Charts: hit rate (%) vs. MW and vs. ClogP, comparing ordinary HTS hit marking with property-biased hit marking]
- More attractive properties – promote
- Less attractive properties – demote
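The slides do not specify how promotion/demotion is implemented, so the following is purely a hypothetical sketch of the idea: shift the statistical cut-off per compound by a penalty that grows with cLogP and MW (demoting large, greasy compounds) and becomes a bonus for small, polar ones. The reference points and weights are invented for illustration only:

```python
def property_biased_cutoff(base_cutoff, clogp, mw,
                           clogp_ref=3.0, mw_ref=370.0,
                           clogp_weight=2.0, mw_weight=0.02):
    """Hypothetical property-biased hit marking (the actual GSK method is
    not given on the slide). Compounds greasier/larger than the reference
    need a larger % effect to be marked as hits ('demote'); smaller, more
    polar compounds get a lower bar ('promote')."""
    penalty = clogp_weight * (clogp - clogp_ref) + mw_weight * (mw - mw_ref)
    return base_cutoff + max(penalty, -base_cutoff / 2)  # cap the promotion

def mark_hit(effect, base_cutoff, clogp, mw):
    """Hit if % effect clears the property-adjusted cut-off."""
    return effect >= property_biased_cutoff(base_cutoff, clogp, mw)
```

With the illustrative weights above, a 35% effect from a small, polar compound (cLogP 1, MW 300) is marked as a hit, while the same effect from a large, lipophilic one (cLogP 5, MW 500) is not, flattening the hit-rate-vs-property curves in the way the slide describes.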
[Chart: % compounds vs. response (% control), with the mean + 3 × RSD cut-off marked]
Evolving the screening collection…
GSK's Compound Collection Enhancement (CCE) strategy – moving the HTS deck towards decreased size and lipophilicity with the aim of improving chemical starting points.
[Charts: ClogP distribution (% of total compounds in HTS) for 2004, 2010, and the Δ 2010 vs. 2004; % compounds exceeding property limits (ClogP > 5, MW > 500) by year, through the new 2011 deck. Compounds tested in HTS test datasets.]
CCE acquisition, property bounds:
- 2004-05: Lipinski criteria (MW < 500, ClogP < 5)
- Most recently: MW < 360, ClogP < 3
- Inclusion of DPU lead-op cpds: MW < 500, ClogP < 5
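The acquisition property bounds quoted above amount to a simple filter. A sketch, with era labels of my own invention rather than GSK terminology:

```python
def passes_cce_bounds(mw, clogp, era="recent"):
    """Acquisition property bounds quoted on the slide.
    'lipinski' = the 2004-05 criteria; 'recent' = the tightened bounds."""
    bounds = {
        "lipinski": (500.0, 5.0),   # MW < 500, ClogP < 5
        "recent":   (360.0, 3.0),   # MW < 360, ClogP < 3
    }
    mw_max, clogp_max = bounds[era]
    return mw < mw_max and clogp < clogp_max
```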
Can property biases translate into lead optimization?
[Chart: Cell – Biochem pIC50 difference vs. binned cLogP; up = more potent in cell, down = more potent in biochem]
Biochemical target assay → cellular "mechanistic" target assay → rodent DMPK, efficacy model → med. chem.
Example from a current Lead Optimization program:
- Cellular activity favors cLogP > 4 – a directional "pull" to more lipophilic cpds?
- Good DMPK at cLogP < 3 – value of the cellular assay?
A "patient in a plate"… or "biochemistry in a (grease-selective) bag"!
Property bias in broad pharmacological profiling
[Chart: average % of assays giving IC50 ≤ 10 uM vs. binned ClogP, for marketed drugs and GSK terminated leads & candidates (n = ~400 and ~1000), and for GSK Lead Op. compounds 2009-11 (n = ~2500)]
Early safety cross-screening panel (eXP):
- GPCRs – 17
- Ion channels – 8
- Enzymes – 3
- Kinases – 4
- Nuclear receptors – 2
- Transporters – 3
- Phenotypic – 3 (Blue Screen, Cell Health, Phospholipidoses)
Kinome profiling – no impact of cLogP
[Charts: % inhibition values (>300 kinase assays) vs. binned ClogP and vs. a kinase structural classifier]
~400 kinase Lead Op compounds vs. 300 protein kinases
Conclusions
- Heuristic approaches allow both refinement of best practice and new insights.
- Standard screening processes favor the selection of lipophilic compounds:
  - a contributing factor in current issues with drug Lead/Candidate property-space occupancy;
  - improvement in screening collections and analysis methods can overcome this, BUT
  - all this effort is wasted if Lead Optimization pathways pull compounds back towards unfavorable property space!!
- The very large datasets generated from screening have considerable value beyond the lifetime of individual campaigns:
  - particularly crucial now that quality and cycle-time problems are largely solved;
  - many other examples exist beyond those shown here;
  - please go look for these effects in your data!
Acknowledgements
Pat Brady, Darren Green, Stephen Pickett, Sunny Hung, Subhas Chakravorty, Nicola Richmond, Jesus Herranz, Gonzalo Colmeranjo-Sanchez
…and numerous others who contributed to programs run by GSK 2004-2011…..
Tony Jurewicz, Glenn Hofmann, Stan Martens, Jeff Gross
Snehal Bhatt, Stuart Baddeley, James Chan, Sue Crimmin, Emilio Diez, Maite De Los Frailes, Bob Hertzberg, Deb Jaworski, Ricardo Macarron, Carl Machutta, Julio Martin-Plaza, Barry Morgan, Juan Antonio Mostacero, Dave Morris, Dwight Morrow, Mehul Patel, Amy Quinn, Geoff Quinique, Mike Schaber, Zining Wu, Ana Roa, and colleagues…
Screening & Compound Profiling