303829/FULLTEXT01.pdfAuthor . Markus K. Dahlgren, Department of Chemistry . Umeå University,...
STATISTICAL MOLECULAR DESIGN, QSAR MODELING, AND SCAFFOLD
HOPPING – DEVELOPMENT OF TYPE III SECRETION
INHIBITORS IN GRAM NEGATIVE BACTERIA
MARKUS K. DAHLGREN
ACADEMIC DISSERTATION
UMEÅ UNIVERSITY
DEPARTMENT OF CHEMISTRY 2010
COPYRIGHT © 2010 MARKUS DAHLGREN
ISBN: 978-91-7264-976-7
PRINTED IN SWEDEN BY VMC-KBC UMEÅ
Title Statistical molecular design, QSAR modeling, and scaffold hopping - Development of type III secretion inhibitors in Gram negative bacteria
Author Markus K. Dahlgren, Department of Chemistry Umeå University, SE-90187, Umeå, Sweden
Abstract Type III secretion is a virulence system utilized by several clinically important Gram-negative pathogens. Computational methods have been used to develop two classes of type III secretion inhibitors, the salicylidene acylhydrazides and the acetylated salicylanilides. For these classes of compounds, quantitative structure-activity relationship models have been constructed with data from focused libraries obtained by statistical molecular design. The models have been validated and shown to provide useful predictions of untested compounds belonging to these classes. Scaffold hopping of the salicylidene acylhydrazides has resulted in a number of synthetic targets that might mimic the scaffold of the compounds. The synthesis of two libraries of analogs of two of these scaffolds and their biological evaluation are presented.
Keywords Statistical molecular design, QSAR, synthesis, type III secretion, virulence, scaffold hopping
ISBN: 978-91-7264-976-7
Contents
1. LIST OF PAPERS
2. ABBREVIATIONS
3. INTRODUCTION
  3.1. COMPUTATIONAL DRUG DESIGN
    3.1.1. Statistical molecular design
    3.1.2. QSAR modeling
    3.1.3. Scaffold hopping
  3.2. BACTERIAL VIRULENCE
    3.2.1. Type III secretion
    3.2.2. Type III secretion inhibitors
4. SCOPE OF THIS THESIS
5. STATISTICAL MOLECULAR DESIGN, SYNTHESIS, AND BIOLOGICAL EVALUATION OF TYPE III SECRETION INHIBITORS
  5.1. SALICYLANILIDES (PAPER I)
  5.2. SALICYLIDENE ACYLHYDRAZIDES (PAPER II)
6. QSAR MODELING OF TYPE III SECRETION INHIBITORS
  6.1. SALICYLANILIDES (PAPER I)
  6.2. SALICYLIDENE ACYLHYDRAZIDES (PAPER II)
  6.3. EVALUATION OF QSAR MODELS USING EXTERNAL TEST SETS
    6.3.1. External test set for the salicylanilides (paper I)
    6.3.2. External test set for the salicylidene acylhydrazides (paper II)
  6.4. QSAR MODELS USED TO DEVELOP AZIDE CONTAINING T3S INHIBITORS FOR TARGET IDENTIFICATION
    6.4.1. Azide containing salicylanilides
    6.4.2. Azide containing salicylidene acylhydrazides
  6.5. CONCLUSIONS FROM QSAR MODELING
7. SCAFFOLD HOPPING FROM A SALICYLIDENE ACYLHYDRAZIDE
  7.1. 2-(2-AMINO-PYRIMIDIN-4-YL)-2,2-DIFLUORO-1-(PHENYL)-ETHANOLS (PAPER III)
  7.2. THIAZOLES (PAPER IV)
  7.3. OTHER SCAFFOLDS
8. CONCLUDING REMARKS
9. ACKNOWLEDGEMENTS
10. REFERENCES
1. List of papers
This thesis is based on the following papers, which will be referred to in the text by their roman numerals I-IV.
I Design, Synthesis, and Multivariate Quantitative Structure-Activity Relationship of Salicylanilides-Potent Inhibitors of Type III Secretion in Yersinia
Dahlgren, M. K.; Kauppi, A. M.; Olsson, I.-M.; Linusson, A.; Elofsson, M.
J. Med. Chem.; 2007; 50(24); 6177-6188. DOI: 10.1021/jm070741b
II Statistical molecular design of a focused salicylidene acylhydrazide library and multivariate QSAR of inhibition of type III secretion in the Gram-negative bacterium Yersinia
Dahlgren, M. K.; Zetterström, E. C.; Gylfe, Å.; Linusson, A.; Elofsson, M.
Bioorg. Med. Chem.; Article in Press DOI: 10.1016/j.bmc.2010.02.022
III Synthesis of 2-(2-amino-pyrimidine)-2,2-difluoro-ethanols identified through scaffold hopping from a salicylidene acylhydrazide
Dahlgren, M. K.; Öberg, C.; Wallin, E.; Jansson, P.; Elofsson, M.
Manuscript
IV Synthesis of [4-(2-Hydroxy-phenyl)-thiazol-2-yl]-methanones, structural analogs of salicylidene acylhydrazides
Hillgren, M.; Dahlgren, M. K.; Tam, T. M.; Elofsson, M.
Manuscript in preparation
Papers I and II are reprinted with kind permission of the publishers.
2. Abbreviations
BB building block
Equiv. equivalents
FD factorial design
PCA principal component analysis
PLS partial least-squares regression to latent structures
PLS-DA PLS-discriminant analysis
MLR multiple linear regression
QSAR quantitative structure-activity relationship
QSPR quantitative structure-property relationship
SAR structure-activity relationship
SMD statistical molecular design
Y. Yersinia
TEA triethylamine
T3S type III secretion
MM molecular mechanics
ADME absorption, distribution, metabolism, excretion
det determinant
DOOD D-optimal onion design
HOMO highest occupied molecular orbital
LUMO lowest unoccupied molecular orbital
3. Introduction
Medicinal chemistry is a discipline involving the design, synthesis, and optimization of small organic molecules as part of the development of drugs and research tools. Drug discovery often starts with a validated hit compound that is identified through unbiased biological screening of compound libraries using robust assays that are representative of a specific disease.1 Analogs of the hit compound are then synthesized in order to ensure that the structure of the hit compound can be varied without entirely losing biological activity. The compound is then further optimized by medicinal chemists to increase potency, selectivity, solubility, etc. The drug development process is an arduous, costly, and time-consuming task. Proficient use of computational tools will reduce the time needed to deliver new drug candidates.2 Biologically active compounds can also be used as research tools in a field known as chemical genetics or chemical biology. Such compounds can be utilized to, for instance, identify receptor targets for a class of compounds with an unknown mode of action or help in the elucidation of complex, not fully understood, biological systems. This thesis exemplifies computational techniques that are highly useful in facilitating compound optimization in the absence of a structural target, including design strategies, synthesis, biological evaluation, and QSAR modeling.
3.1. Computational drug design
Computational chemistry can facilitate drug development at all stages if applied proficiently. If the structure of the target is known, a wide range of computational techniques is available, including molecular docking3, 4, pharmacophore mapping of the binding site, and structure-based design. If the hit compounds have been identified in, for instance, cell-based assays and no structural target is known, the computational techniques are more limited and are applied to the ligands. Simple filters, like Lipinski's rules,5 can easily be computed to filter out compounds that are unlikely to be developed into, for instance, orally administered drugs. Pharmacophore mapping of key features of the ligands can be used to construct a model for selecting promising candidates from virtual libraries. Scaffold hopping methods are usually similarity based, and the aim is to identify compounds that retain key interaction features, such as hydrogen bond donors and acceptors, but replace the scaffold in order to, for instance, increase potency and
chemical stability. Scaffold hopping has become a popular tool in the pharmaceutical industry to develop so-called fast-follower drugs, where scaffold hopping is performed on compounds in clinical trials with the aim of identifying new compounds with good patentability that can quickly be developed into drugs. Quantitative structure-activity relationship (QSAR)6, 7 models can be used to predict the biological activity of new compounds belonging to the same class.8-11 The input to such models is numerical descriptions of chemical features and one or more responses (e.g. quantitative biological data from one or more in vitro assays). The responses can be any quantitative assay readout, meaning that such models can be used to optimize not only activity but also other properties, such as solubility, membrane permeability, and reduced toxicity. When properties other than biological activity are used as responses, the models are called quantitative structure-property relationship (QSPR) models. Statistical molecular design (SMD)12-14 can be used with the preset goal of achieving reliable QSAR models. It is a strategy for systematically varying the chemical features of the compounds believed to be important for the response, effectively reducing the number of compounds to be synthesized while retaining much of the information of the full set.
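The simple ligand-based filters mentioned above can be illustrated with a minimal sketch. It assumes that the four properties have already been computed by some descriptor program; the function names, the dictionary keys, and the property values are hypothetical and are not taken from the thesis. The thresholds are the commonly cited rule-of-five limits, and tolerating a single violation is a frequently used convention rather than a fixed rule.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count violations of the commonly cited rule-of-five thresholds."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        log_p > 5,          # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen bond donors
        h_acceptors > 10,   # more than 10 hydrogen bond acceptors
    ]
    return sum(checks)


def passes_filter(properties, max_violations=1):
    """Keep a compound if it breaks at most `max_violations` thresholds."""
    return lipinski_violations(**properties) <= max_violations


# Hypothetical, precomputed property values for two virtual compounds.
library = {
    "compound_A": {"mol_weight": 320.4, "log_p": 2.9, "h_donors": 2, "h_acceptors": 5},
    "compound_B": {"mol_weight": 712.9, "log_p": 6.3, "h_donors": 6, "h_acceptors": 12},
}

kept = [name for name, props in library.items() if passes_filter(props)]
print(kept)
```

Such a filter only removes unlikely candidates; it says nothing about activity, which is what the QSAR models discussed in this thesis address.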
3.1.1. Statistical molecular design
When a compound class is subjected to a medicinal chemistry project for optimization of one or more properties or responses, QSAR modeling can facilitate the process. QSAR models relate chemical features of compounds to responses through regression modeling according to equation 3.1.
$$y_i = \sum_{k=1}^{K} b_k x_{ik} + f_i$$
Equation 3.1. General form of a QSAR model equation. $y_i$ is the ith response, $x_{ik}$ is the value of predictor variable k (k = 1…K) for the ith compound, $b_k$ is the model coefficient for variable k, and $f_i$ is the residual for the ith response.
The predictor variables in the equation are numerical descriptions of chemical features of the molecules which can be expanded to include higher order terms, such as cross and square terms, to give interaction models. By use of SMD, a representative compound set is selected with the aim of
obtaining a QSAR model with minimal error in the model's coefficients, $b_k$, estimated from the data and maximizing the likelihood of reliable predictions. The most commonly used experimental designs in SMD are factorial designs (FDs)15, 16 and D-optimal designs17, 18. Both types of designs are used to systematically vary the chemical features of interest in order to obtain a representative subset of compounds. As a result of the systematic structural variation, this subset should contain compounds that give a balanced spread in the measured response. In addition, the designs select compounds that span the experimental domain (i.e. all possible synthetic targets) in order to minimize the error in the QSAR model coefficients (figure 3.1a). In contrast, a compound set with only small variation in the chemical features will give large errors in the QSAR model coefficients (figure 3.1b). While FDs systematically vary chemical features at high and low levels, D-optimal designs select a set of compounds that spans as large a volume as possible. In mathematical terms, a D-optimal design selects m compounds from a candidate matrix with K columns (the chemical features investigated) and n rows (the entire candidate set) in such a way that det(X'X) is maximized, where X is the matrix of the m selected compounds and det denotes the determinant.
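The D-optimality criterion can be made concrete with a small sketch. It is an illustration under stated assumptions, not the procedure used in the thesis: the candidate set is hypothetical, the two features are assumed to be centered and scaled, no intercept or interaction terms are included, and the subset is found by exhaustive search, which is only feasible for very small candidate sets (design software typically relies on exchange algorithms instead).

```python
from itertools import combinations


def gram_matrix(rows):
    """Return X'X for a design matrix given as a list of rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def d_optimal_subset(candidates, m):
    """Exhaustively pick the m candidate rows that maximize det(X'X)."""
    best = max(
        combinations(range(len(candidates)), m),
        key=lambda idx: determinant(gram_matrix([candidates[i] for i in idx])),
    )
    return list(best)


# Hypothetical candidate set: two scaled chemical features per compound.
candidates = [
    [-1.0, -1.0], [0.0, 0.0], [-1.0, 1.0], [0.2, -0.1],
    [0.5, 0.5], [1.0, -1.0], [-0.3, 0.4], [1.0, 1.0],
]
selected = d_optimal_subset(candidates, 4)
print(selected)
```

For this candidate set the search picks the four extreme compounds, which matches the statement that a D-optimal design spans as large a volume as possible.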
Figure 3.1 An assay that for each investigated compound gives a variation (error bars) in the measured response, Y, will result in errors in the QSAR model coefficient. The QSAR model coefficient is the slope of the line, established through regression. In each figure two lines are drawn that show the maximum variation of the QSAR model coefficient. a) A large change of a chemical feature X1 gives a more reliable QSAR model coefficient. b) If X1 is varied only slightly the variation of the assay will give a large error of the QSAR model coefficient.
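The argument of figure 3.1 can be sketched numerically. The example assumes ordinary least-squares regression on a single feature with a known, constant assay standard deviation; under these assumptions the standard error of the slope is the assay standard deviation divided by the square root of the summed squared deviations of the feature from its mean. The function name and the numbers are hypothetical.

```python
import math


def slope_standard_error(x_values, assay_sd):
    """Standard error of a least-squares slope for a known assay SD."""
    mean_x = sum(x_values) / len(x_values)
    spread = sum((x - mean_x) ** 2 for x in x_values)
    return assay_sd / math.sqrt(spread)


assay_sd = 5.0  # hypothetical standard deviation of the measured response

wide = [-1.0, -0.5, 0.0, 0.5, 1.0]        # large variation in X1 (figure 3.1a)
narrow = [-0.1, -0.05, 0.0, 0.05, 0.1]    # small variation in X1 (figure 3.1b)

wide_se = slope_standard_error(wide, assay_sd)
narrow_se = slope_standard_error(narrow, assay_sd)
print(wide_se, narrow_se, narrow_se / wide_se)
```

Shrinking the variation in the feature tenfold inflates the uncertainty of the coefficient tenfold, even though the assay itself is unchanged.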
SMD using FDs or D-optimal designs can run into unforeseeable problems. For instance, some of the chemical features investigated might not correlate with the response, and there might be other important chemical features, highly correlated with biological activity, that are not investigated by the selected compounds. To remedy the latter it is
usually a good idea to add extra compounds to the designed set. A center point, i.e. a compound that has average values for all properties investigated, should be added to monitor possible nonlinearities. Additional compounds can be added manually or by implementation of a multilayer design, such as D-optimal onion design (DOOD)19. In a DOOD the chemical space is divided into onion layers and a D-optimal design is performed in each layer. DOOD can be centered on any compound, for instance, the compound with the highest biological activity.
SMD and QSAR rely on the computation of molecular descriptors, which essentially are numerical descriptions of chemical features. Descriptors are generally classified as 1D, 2D, or 3D descriptors. 1D descriptors, such as atom counts and molecular weight, can be calculated directly from the molecular formula and generally do not require a program. 2D descriptors give information about how the atoms are connected and can be calculated from connectivity tables, i.e. tables that describe how the atoms are connected to each other in a given molecule. Examples of 2D descriptors are connectivity indices, volumes and areas that are calculated from connectivity tables, surface and volume approximations of certain chemical groups, density, and bond counts. 3D descriptors are conformation dependent and are computed in programs that require a 3D input of each structure. There are essentially four different levels of computation for descriptors, i.e. informatics, molecular modeling, semi-empirical, and quantum chemical descriptors. Informatics descriptors generally do not require programs for computations, and examples include many 1D and 2D descriptors. Molecular modeling descriptors are based on force-field mechanics and can be obtained through software such as MOE20 and DRAGON21. 
Semi-empirical descriptors are derived through regression based on parameters, often established through quantum chemical calculations, defined in the given program. Semi-empirical descriptors can be obtained through software such as MOE and Spartan, for which the semi-empirical calculations have been documented22. Quantum chemical descriptors can be calculated using various software such as Spartan, Jaguar23, and Gaussian24. The descriptors need to be relevant for the response for which the compounds are going to be optimized. It is advisable to select a wide range of descriptors in order to increase the chance of describing features important for the activity that can be used in QSAR modeling. When large sets of descriptors are chosen it quickly becomes problematic to visualize the data set. Additionally, descriptors are often heavily correlated which means that they
essentially describe the same chemical feature (e.g. molecular weight, volume, and number of atoms all represent the chemical feature size). These problems can be addressed by variable selection or by the use of principal component analysis (PCA)25, 26, which is used to compute orthogonal principal components. The principal components describe the main variation of the data and form a hyperplane onto which the entire data set can be projected. The principal components describe chemical features that are uncorrelated, also known as principal properties, which can be used directly as design variables in what is called a multivariate design.27, 28 SMD can be performed at both the building block (BB) and product level. If BBs prove unreactive or otherwise problematic, and an SMD has been performed at the BB level, it is easy to manually exchange them for other BBs that are close in principal component space. QSAR modeling based on BBs selected through multivariate design offers direct insight into how local structural alterations affect the response. Additionally, it is computationally more efficient to perform SMD at the BB level than at the product level.
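How PCA condenses correlated descriptors into one principal property can be sketched as follows. The descriptor values are hypothetical, the descriptors are autoscaled (mean-centered and scaled to unit variance), and the first principal component is obtained by power iteration on the correlation matrix, a simple substitute for the algorithms used in dedicated multivariate software.

```python
import math


def autoscale(columns):
    """Mean-center each descriptor column and scale it to unit variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        scaled.append([(v - mean) / sd for v in col])
    return scaled


def first_principal_component(columns, iterations=200):
    """Return (loadings, explained variance fraction) of the first PC."""
    z = autoscale(columns)
    k, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]
    # Power iteration; the all-ones start vector assumes the first component
    # is not orthogonal to it, which holds for positively correlated descriptors.
    vec = [1.0] * k
    for _ in range(iterations):
        vec = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(k))
                     for i in range(k))
    # The trace of a correlation matrix equals the number of descriptors.
    return vec, eigenvalue / k


# Hypothetical size-related descriptors for five compounds.
mol_weight = [180.0, 250.0, 310.0, 400.0, 455.0]
volume = [160.0, 230.0, 280.0, 370.0, 410.0]
atom_count = [13.0, 18.0, 22.0, 29.0, 33.0]

loadings, explained = first_principal_component([mol_weight, volume, atom_count])
print(loadings, explained)
```

All three descriptors load in the same direction on the first component, which carries nearly all of the variance and can be read as a single principal property describing size.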
3.1.2. QSAR modeling
After synthesis and biological evaluation of a designed compound set has been completed, QSAR models can be computed, relating molecular descriptors or principal properties of a set of compounds (i.e. the training set) to one or more responses through regression. Commonly used regression techniques for QSAR modeling are partial least-squares regression to latent structures (PLS)29, 30 and multiple linear regression (MLR)31. Prior to regression modeling, data are sometimes filtered. Orthogonal signal correction (OSC) is a method that can be used to remove variation in the descriptors that is orthogonal, i.e. uncorrelated, to the response.32 Orthogonal projections to latent structures (OPLS) is essentially a PLS method with an integrated OSC filter.33 Support vector machines (SVM) are another type of regression technique that can be used to model nonlinear data.34 In order to compute QSAR models, the compounds need to interact with the biological target through the same mechanism and possess an even spread in biological activity obtained with robust and reproducible assays, and the numerical descriptions of the compounds need to be relevant for the responses. If SMD has been performed prior to QSAR modeling, the designed set should be of manageable size, allowing simultaneous evaluation of the entire compound
set in replicates. The biological evaluation should be performed on several occasions to get reliable data. Some compounds from the designed set that are inactive can be included in the training set to get complementary information, but those compounds need to be inactive due to unfavorable interactions with the biological target and not through inability to, for instance, pass through cell membranes. It is often difficult to decide whether an inactive compound should be included in the QSAR modeling. One way to investigate whether an inactive compound is suitable for inclusion in the training set is to compute a QSAR model for the active compounds and use that model to predict all the inactive compounds. Those compounds predicted as inactive could then be added to the training set. QSAR models are useful for gaining an understanding of which chemical features correlate with the responses, even with limited prior knowledge. That information can be extracted from a training set using, for instance, PLS regression, relating descriptors or principal properties of the training set to one or more responses, and subsequent variable selection. QSAR models are also highly useful for prediction of responses for compounds that have not been synthesized and biologically evaluated. The compounds used for predictions are called the test set. In order to get reliable predictions, the training set needs to cover the chemical features of the test set. It can therefore be of interest to do a second round of SMD and QSAR modeling, using the coefficients of the former QSAR models as design parameters, to get new models that offer more accurate predictions. The test set should be selected, synthesized, and biologically evaluated after the QSAR models have been computed to ensure unbiased evaluation of the models. These types of test sets are usually called external test sets. 
The predictive power of any given model is usually not as good as indicated by the Q2 value,35 and therefore it is of utmost importance to critically test and evaluate the models with external test sets.
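The relation between the goodness of fit and the cross-validated Q2 value can be illustrated with a minimal sketch. It assumes a single descriptor, ordinary least-squares regression instead of PLS, and leave-one-out cross-validation, with Q2 computed as one minus the ratio of the predictive residual sum of squares (PRESS) to the total sum of squares around the overall mean. The data and function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_q2(x, y):
    """Return (R2, leave-one-out Q2) for a one-descriptor linear model."""
    n = len(x)
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)

    # R2: residuals from the model fitted to all compounds.
    slope, intercept = fit_line(x, y)
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

    # Q2: each compound is predicted by a model that never saw it.
    press = 0.0
    for i in range(n):
        s, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (s * x[i] + b)) ** 2

    return 1 - rss / ss_total, 1 - press / ss_total


# Hypothetical descriptor values and % inhibition for six compounds.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
inhibition = [12.0, 30.0, 41.0, 66.0, 70.0, 95.0]

r2, q2 = r2_and_q2(descriptor, inhibition)
print(r2, q2)
```

Q2 is lower than R2 because every prediction is made for a left-out compound, but both values are still computed from the training set only. This is why the text stresses evaluation with external test sets selected after the model has been built.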
3.1.3. Scaffold hopping
SMD is usually used to vary the substitution pattern on a given scaffold or the BBs used to synthesize a library with a common scaffold. Scaffold hopping,36 on the other hand, usually aims to keep key interaction points and favorable substitution patterns while changing the scaffold of a compound class. The two techniques are therefore complementary and can be used in conjunction. 3D scaffold hopping methods that consider flexibility, geometry, and pharmacophore-like molecular properties have been published and outperform 2D methods.37, 38 A more recently published method is SHOP,39 which considers geometrical features of the scaffold, shape, and alignment-independent GRID descriptors.36 A receptor-based scaffold hopping method that is incorporated in SHOP has recently been developed.40 Scaffold hopping can be used in early drug discovery to identify additional lead compounds (backup leads), which should lower the risk of attrition during drug development due to factors such as undesirable absorption, distribution, metabolism, and excretion (ADME) properties. The creation of intellectual property is also facilitated. In addition, the method can be used for finding bioisosteres. In this thesis the program SHOP39 has been used and is the only software discussed in detail.
3.2. Bacterial virulence
Infectious diseases, with a substantial contribution from pathogenic bacteria, are the leading cause of death worldwide.41 Antibiotics, i.e. compounds that kill or inhibit growth of bacteria, have proven to be very effective against infectious diseases caused by pathogenic bacteria in those regions where they have been available. Antibiotics that target the bacterial cell wall (for example penicillin) or cell membrane, or interfere with essential bacterial enzymes, are usually bactericidal (killing bacteria) in nature, while those that target protein synthesis are usually bacteriostatic (inhibitors of bacterial growth).42 Even though antibiotics have been highly successful against infectious diseases, they are not without side effects. Since they target general features common to most bacteria, essential bacteria in the intestinal flora will also be affected by antibiotics, causing adverse side effects. Multidrug-resistant bacterial strains have surfaced that resist most treatments available on the market. Antibiotic resistance is a result of selection for organisms that have an enhanced ability to survive doses of antibiotics that previously would have been lethal. Those bacteria which have developed resistance allowing them to withstand an antibiotic treatment will survive and live on to reproduce.43 They will then pass on that trait, which will result in a fully resistant colony. Resistance can also be the result of horizontal gene transfer, in which a bacterium can incorporate genetic material from another bacterium without being its offspring.44 Since antibiotics target non-pathogenic bacteria as well, resistance will develop rapidly even outside of the host. This creates a demand for new antibiotics to combat resistant strains, but to avoid the rapid development of resistance, new strategies to combat bacterial infectious diseases are needed.
Targeting the functions through which bacteria are able to evade the immune response or establish disease might halt or slow the progress of bacterial disease and reduce the risk of giving rise to resistant strains.41 The methods through which different bacteria invade the host, evade the host immune response, proliferate within the host, and establish disease are broadly termed virulence mechanisms. Developing drugs that target virulence mechanisms generally poses a bigger challenge than the development of traditional antibiotics, since such systems usually require activation, either artificially or by placement of the bacteria in an environment where the specific virulence system is triggered. General examples of virulence mechanisms include, but are not limited to, bacterial adhesion to host cell surfaces, secretion and translocation of toxins and immune response inhibitors, quorum sensing, invasion of host tissue or host cells, and colonization of compartments within the host.41 In addition to serving as anti-virulence drugs, small organic molecules that have been identified as virulence inhibitors through biological screening can be used as research tools to elucidate complex, not well understood biological mechanisms involved in expression, regulation, and function of the virulence system.
3.2.1. Type III secretion
Type III secretion (T3S) is a virulence system found in several Gram-negative animal pathogens, such as Yersinia spp., Pseudomonas aeruginosa, Chlamydia spp., Salmonella spp., and Shigella flexneri.45 The virulence system also exists in several Gram-negative plant pathogens. The function of T3S varies between different bacterial species. The molecular events during Yersinia infections have been extensively studied,45-48 and Yersinia thus serves as an excellent model organism to study T3S and evaluate inhibitors.48, 49 In this thesis Yersinia pseudotuberculosis (Y. pseudotuberculosis) has been used as a model organism for evaluation of T3S inhibitors. When Y. pseudotuberculosis senses contact with a eukaryotic cell, the bacterium will secrete and translocate effector proteins into the cytosol of the target cell. The effector proteins target specific functions of the target cell, such as phagocytosis and inflammatory responses,46 allowing the bacteria to subvert it and proliferate (figure 3.2a). A T3S inhibitor would stop the secretion, thus preventing the bacteria from injecting the effector molecules into the cytosol of the target cell. The disarmed bacteria would be
eliminated through phagocytosis and the infection would be cleared (figure 3.2b).
Figure 3.2. Schematic representation of a Yersinia infection; a) The bacterium will sense contact with the eukaryotic cell and adhere to it. The cytosol of the target cell will be injected with effector molecules that will turn off the immune response and subvert the target cell. In absence of a functional immune defense, the bacteria will proliferate; b) The addition of a T3S inhibitor will prevent the injection of effector molecules, resulting in a functional immune defense that can clear the infection.
3.2.2. Type III secretion inhibitors
Prior to the work described in this thesis, a number of T3S inhibitors were identified and published within the research group.50 Three classes of inhibitors were identified: an acetylated salicylanilide that was a singleton in the biological screening campaign (figure 3.3a), salicylidene acylhydrazides (figure 3.3b), and a 2-arylsulfonylamino-benzanilide (figure 3.3c) that was also a single hit within its class. All three classes were further investigated through synthesis of analogs and biological evaluation. The 2-arylsulfonylamino-benzanilides were subjected to SMD, synthesis, biological evaluation, and QSAR modeling.51 For the acetylated salicylanilide, three analogs were synthesized and biologically evaluated, in which the acetyl group was exchanged for a propanoyl or butanoyl group or the salicylanilide was left unacetylated.52 A number of salicylidene acylhydrazides were synthesized and evaluated for their ability to inhibit T3S.53 Since that study, the salicylidene acylhydrazides have been used extensively as research tools to study the function of T3S in the wide range of organisms in which they are active.54 In this thesis, the acetylated salicylanilides and the salicylidene acylhydrazides were subjected to SMD, synthesis, and QSAR modeling (figures 3.3b and 3.3d).
Figure 3.3. The structures of the compounds identified from biological screening and the general structures of the compounds studied in this thesis. a) the acetylated salicylanilide that was a singleton in the biological screening campaign; b) the general structure of the salicylidene acylhydrazides; c) the 2-arylsulfonylamino-benzanilide that was a singleton in the biological screening campaign; d) the general structure of the acetylated salicylanilides.
4. Scope of this thesis
This thesis describes the use of computational tools to optimize T3S inhibitors. The SAR of the acetylated salicylanilide had not been investigated beyond manipulation of the acetyl group, and the first goal was to investigate the SAR by variation of the substitution pattern on both the salicylic acid and aniline ring moieties. The information gained from the SAR would be used to guide an SMD of the compound class that hopefully would lead to the final goal, namely the establishment of a QSAR model. The salicylidene acylhydrazides were to be subjected to SMD directly, using information from the previously published compounds, with the aim of establishing a QSAR model for this compound class as well. In the latter part of the graduate studies, focus was shifted towards finding alternatives to the salicylidene acylhydrazides, since a number of challenges associated with the core of the compounds were identified. The salicylidene acylhydrazides are interesting from a biological and possibly a clinical perspective in that they inhibit T3S in several relevant Gram-negative organisms. Scaffold hopping of the central fragment with subsequent synthesis of a small number of the resulting scaffolds was planned for the last part of this thesis.
5. Statistical molecular design, synthesis, and biological evaluation of type III secretion inhibitors
All synthesized compounds presented in this thesis were evaluated for their ability to inhibit T3S in a reporter-gene assay as previously described.53 The biological read-out from this assay (% inhibition of luciferase light emission) is directly proportional to inhibition of the reporter gene. To verify that the inhibition was not a result of direct interference with luciferase or the light signal, an additional assay based on the secreted effector molecule YopH51 was used for all the salicylidene acylhydrazides. Another method used to verify inhibition of secretion was Western blot,50 which was used to evaluate some of the salicylanilides. All salicylidene acylhydrazides and some of the salicylanilides were also tested for bacterial growth inhibition, as previously described,50 to ensure that the observed reporter-gene inhibition was not due to general toxicity. The SARs of the acetylated salicylanilides and the salicylidene acylhydrazides had previously not been studied in detail. Without any structural information on the biological target, we decided to use SMD as a strategy to design focused compound libraries that hopefully could be used to establish QSAR models for both classes of compounds. This section describes the SMD strategies used.
5.1. Salicylanilides (paper I)
The acetylated salicylanilide (1a, table 5.1) was a single hit from the initial screening50 and only structural variation around the acetyl group had previously been investigated.52 No structural target was known for this compound class. It was roughly three times more potent against T3S than the most potent of the previously published salicylidene acylhydrazides.50, 53 Analogs to 1a could be synthesized via amide coupling of a salicylic or benzoic acid and an aniline. Subsequent acetylation of the salicylic hydroxyl group under acidic conditions gave the acetylated salicylanilide analogs
(scheme 5.1). Interestingly, if the acetylation was performed with pyridine as catalyst, acetylation of both the hydroxyl and the amide groups was observed.
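Scheme 5.1. Synthesis of analogs to the acetylated salicylanilide 1a. Reagents and conditions: amide coupling (R = H or R = OH) with PCl3, toluene, MWI, 150 °C, 10 min; acetylation (R = OH to R = OAc) with Ac2O, phosphoric acid, 70 °C, 30 min.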
An initial SAR study was performed, through the synthesis and biological evaluation of two previously synthesized (1a and 1b)52 and five new analogs (table 5.1).
Reporter-gene signal inhibition at four compound concentrations

ID   R    100 μM‡    50 μM‡     20 μM‡     10 μM‡
1a   Ac   99 ± 0     99 ± 0     98 ± 0     76 ± 1
1b   H    99 ± 1     100 ± 0    100 ± 0    100 ± 0
2a   Ac   100 ± 0    100 ± 0    100 ± 0    99 ± 2
2b   H    78 ± 6     80 ± 3     85 ± 1     91 ± 1
3a   Ac   99 ± 1     92 ± 3     44 ± 11    23 ± 4
3b   H    100 ± 0    99 ± 0     64 ± 4     14 ± 1
4    -    -          -          -          -
Table 5.1. Six SAR compounds were synthesized to probe the biologically tolerated structural variation of the original hit 1a. The compounds were evaluated using the reporter-gene assay. ‡Means and standard deviations were calculated from triplicates, and experiments were reproduced on at least two separate occasions.
In the SAR study only structural modification of the salicylic acid moiety was performed. In retrospect, the aniline moiety should have been manipulated as well to investigate whether the aniline moiety could be structurally altered without complete loss of T3S inhibition. The results indicated that the salicylic acid moiety allowed exchange of both iodines with hydrogen atoms without complete loss of biological activity. Interestingly the compounds synthesized from 5-iodo-salicylic acid (2a and 2b) were roughly ten times more potent than the original hit. Exchange of the hydroxyl or O-acetyl groups with hydrogen, as in 4, resulted in complete loss of T3S activity
at compound concentrations as high as 100 μM. Based on these results a second selection of compounds was planned, where different salicylic acids and anilines would be selected to form a virtual library from which a number of targets for synthesis would be chosen. From commercial sources 25 anilines and 22 salicylic acids were chosen based on availability, price, substitution pattern, chemical compatibility, and size. All combinations of the BBs were enumerated, resulting in 550 virtual salicylanilides. A three component PCA model (R2 = 0.97, Q2 = 0.96), describing size, hydrophobicity, density, and connectivity, was used for manual selection of 16 new unacetylated salicylanilides (figure 5.1).
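The enumeration step can be sketched as follows. This is a minimal illustration, not the procedure actually used: the building-block identifiers are placeholders, and only the counts (25 anilines and 22 salicylic acids) are taken from the text.

```python
from itertools import product

# Placeholder identifiers; the actual reagents are not listed in this section.
anilines = [f"AN{i:02d}" for i in range(1, 26)]          # 25 anilines
salicylic_acids = [f"SA{i:02d}" for i in range(1, 23)]   # 22 salicylic acids

def enumerate_library(acids, amines):
    """Return every acid/aniline pair, i.e. the full virtual library."""
    return list(product(acids, amines))

virtual_library = enumerate_library(salicylic_acids, anilines)
print(len(virtual_library))  # 22 * 25 = 550
```

Each pair corresponds to one virtual salicylanilide, which would then be characterized with descriptors before the PCA.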
Figure 5.1. Manual selection of salicylanilides from a three component PCA model (first two components shown). The hit compound, 1a, is marked with an open ring. The selected compounds are marked with filled circles. The first PC corresponds to size and hydrophobicity. The second PC represents the density of the compounds (molecular weight divided by molecular volume). The third PC describes hydrophobicity and connectivity of the compounds.
The 16 acetylated and the corresponding unacetylated salicylanilides were synthesized and biologically evaluated. The yields, including the SAR compounds, ranged from 2-60% over two steps. The resulting data did not contain enough active compounds to compute a QSAR model. Only five compounds displayed higher than 50% reporter-gene inhibition at 20 μM concentration, therefore a complementary selection was performed. For this selection DOOD was applied, which was especially attractive since it allowed the design to be performed around the most potent compound. A few
additional BBs with differing size and shape were added to the design to complement the previously used BBs. The BBs were characterized with conformation independent descriptors that mainly described electronic properties, hydrophobicity, size, and surfaces. Two PCA models were computed, one for each BB set. The first three score vectors for each BB set, in combination with the molecular descriptors SlogP (an atomic contribution model that calculates logP from the given structure) and total polar surface area for the products, were used as design variables. The three compounds with the highest activity were set as vertices in an inner shell and the center of the vertices was set as center point for the DOOD. The entire candidate set was divided into layers in such a way that the thickness of the two outer layers was equal to 10% of the thickness of the inner layer. The previously synthesized compounds were set as inclusions in the DOOD, and five new compounds were selected. An additional compound that was geometrically closest to the theoretical centre point was added. Both unacetylated and acetylated versions of the entire compound set were synthesized and biologically tested. The acetylated and unacetylated compounds were generally pair-wise active. Five of the six acetylated and all of the unacetylated compounds showed a dose-response pattern, highlighting the usefulness of DOOD to select compounds with likelihood of being biologically active. In total 51 compounds were synthesized, with yields for the acetylated salicylanilides ranging from 2-60% over two steps, and biologically evaluated. Of the acetylated salicylanilides, 13 displayed higher than 40% reporter-gene inhibition at 50 μM compound concentration.
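One step of this design, picking the candidate closest to the theoretical centre point, can be sketched as below. The sketch assumes plain Euclidean distance in the design space and uses invented two-dimensional coordinates; it does not reproduce the layering or the D-optimal selection of the DOOD itself.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def closest_to_center(candidates, vertices):
    """Return the ID of the candidate nearest the centroid of the vertices."""
    center = centroid(vertices)
    return min(candidates, key=lambda cid: math.dist(candidates[cid], center))

# Hypothetical design coordinates, for illustration only.
vertices = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]   # the three most active compounds
candidates = {"c1": (1.2, 0.9), "c2": (4.0, 4.0), "c3": (-2.0, 1.0)}
print(closest_to_center(candidates, vertices))    # c1
```

In the actual design the coordinates would be the PCA score vectors and product descriptors used as design variables.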
5.2. Salicylidene acylhydrazides (paper II)
A number of salicylidene acylhydrazides that displayed inhibition of T3S had previously been published by Nordfelth et al.53 By close inspection of the structure of the compounds and their corresponding biological activity it was concluded that there was no clear SAR. The aromatic rings of both BB sets tolerated substitution with different functional groups or exchange to heteroaromatic systems or fused aromatics without loss of biological activity. The salicylic aldehydes could also be exchanged for salicylic ethanones, albeit with a small decrease in reporter-gene signal inhibition. Substructure searches of the compound library from the original screening campaign identified a number of compounds in which the salicylic hydroxyl group were
lacking or were replaced by an alkyloxy substituent. Those compounds completely lacked T3S inhibition, indicating that the hydroxyl group was of vital importance. We believed that the characterization of the compounds would be of great importance to be able to establish QSAR models. The salicylic aldehyde ring tolerated substitution with both polar and hydrophobic substituents. This led us to believe that the SAR was not dependent on the polarity of the salicylic aldehydes, but perhaps the atomic partial charges of the salicylic aldehyde aromatic carbons. Additionally, electron donating and withdrawing substituents would directly affect the pKa of the salicylic phenol proton. Substructure search for commercially available, and chemically compatible, hydrazides and salicylic aldehydes was performed. Through some of the major commercial sources (Aldrich, Acros, Alfa Aesar, Maybridge, and ABCR), 48 salicylic aldehydes and 92 hydrazides were readily available for ordering. Prior to the computation of molecular descriptors, conformational analysis was performed for each BB set. For the salicylic aldehydes a conformational search was performed using the software OMEGA55 with default settings. For the hydrazides a stochastic conformational search was performed in MOE20. The lowest energy conformations of each BB set were geometry optimized using Hartree-Fock calculations. The pKa of the phenol proton and the atomic partial charges of the aromatic carbon atoms of all salicylic aldehydes were calculated. In addition molecular descriptors describing shape, size, hydrophobicity, surface properties, electronic properties, and partial charges were computed for both BB sets. Based on the previously published structures it appeared that the biological response was more sensitive to the substitution pattern of the salicylic aldehydes and tolerated a greater structural variety of hydrazides. 
To emphasize some of the ab initio calculated properties, which were believed to be important for T3S inhibition, such as pKa, partial charges, and orbital energies, the MM descriptors describing hydrophobicity and charges were grouped in two separate groups and summarized using PCA. The score vectors from those two models and the other ungrouped variables comprised the descriptor set for the salicylic aldehyde BB set. The design was performed on the BB level and to reduce computational time a selection of products was planned based on the selected BBs. For each BB set a two layer DOOD was computed resulting in 18 hydrazides and 17 salicylic aldehydes. 5-bromo-salicylic aldehyde, a BB that had been used to synthesize several active salicylidene acylhydrazides,53 was added to make the two BB sets of equal size. Each BB was planned to be used three times in
the final products, resulting in 54 salicylidene acylhydrazides. By using each selected BB three times the risk of erroneous conclusions about the BBs in subsequent SAR analysis would hopefully be minimized. The two BB sets were listed in random order in two separate columns. Combination of the two columns yielded the first 18 virtual products. The hydrazide column was then shifted one step downwards so that the eighteenth BB became the first and the two columns were combined anew to yield the second set of 18 virtual products. The procedure was repeated to yield the final 18 compounds and combination of the three sets gave the 54 targets for synthesis (figure 5.4).
Figure 5.4. Systematic combination of BBs to yield a set of 54 virtual products where each BB is represented three times.
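The column-shifting procedure can be written compactly. The following is a sketch under the assumption that both columns hold 18 BBs and that the hydrazide column is rotated by one position per round; the BB names are placeholders.

```python
def cyclic_combinations(column_a, column_b, rounds=3):
    """Pair two equal-length columns, rotating column_b one step down each round."""
    if len(column_a) != len(column_b):
        raise ValueError("both building-block columns must have the same length")
    products = []
    shifted = list(column_b)
    for _ in range(rounds):
        products.extend(zip(column_a, shifted))
        # Move the last entry to the top, as described for the hydrazide column.
        shifted = [shifted[-1]] + shifted[:-1]
    return products

aldehydes = [f"SAL{i}" for i in range(1, 19)]    # placeholder IDs
hydrazides = [f"HYD{i}" for i in range(1, 19)]   # placeholder IDs
targets = cyclic_combinations(aldehydes, hydrazides)
print(len(targets))  # 54
```

Because each round uses a different rotation, no pair is repeated and every BB appears in exactly three products.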
Out of the 54 target compounds, 50 could successfully be synthesized with a purity of at least 95% and generally more than 98%. One of the hydrazide BBs, butyric acid hydrazide, was unreactive under the reaction conditions used and thus three of the target compounds could not be synthesized. The final failed synthesis was due to problematic purification of the target compound. All compounds were biologically evaluated for reporter-gene inhibition, and phosphatase activity originating from secreted YopH was measured. Before starting the QSAR modeling the compounds not specifically targeting T3S had to be removed. The compounds that inhibited the reporter-gene signal by at least 40% at 50 μM and reduced the YopH activity were classified as active. In addition the inhibition of the reporter-gene needed to be dose-dependent. Bacterial growth experiments were performed for all compounds to verify that the observed activity was not the result of toxicity. No or modest effect on growth was observed. Five compounds were removed from the modeling due to lack of reduction of YopH activity. 18 out of 50 salicylidene acylhydrazides were classified as active.
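The activity criteria can be expressed as a small rule. This is only a sketch: the text does not define dose-dependence numerically, so the check below (inhibition never decreasing with increasing concentration) is an assumption, and the example readings are invented.

```python
def is_dose_dependent(inhibition_by_conc):
    """Assumed criterion: inhibition never decreases as concentration increases."""
    ordered = [inhibition_by_conc[c] for c in sorted(inhibition_by_conc)]
    return all(low <= high for low, high in zip(ordered, ordered[1:]))

def is_active(inhibition_by_conc, reduces_yoph, threshold=40.0, conc=50):
    """Apply the three criteria: threshold at 50 uM, YopH reduction, dose-dependence."""
    return (
        inhibition_by_conc[conc] >= threshold
        and reduces_yoph
        and is_dose_dependent(inhibition_by_conc)
    )

# Hypothetical readings: {concentration in uM: % reporter-gene signal inhibition}
example = {10: 5.0, 25: 30.0, 50: 62.0}
print(is_active(example, reduces_yoph=True))  # True
```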
6. QSAR modeling of type III secretion inhibitors
Since there were no published SAR analyses for the acetylated salicylanilides and the salicylidene acylhydrazides, QSAR models were established by first expanding descriptors or PPs to include higher order terms. Subsequent variable selection identified terms that correlated with the investigated response. This section describes the QSAR modeling of the two compound classes.
6.1. Salicylanilides (paper I)
Before starting the QSAR modeling we had to decide whether to perform the modeling on the acetylated or the unacetylated compounds, or both. Since unacetylated salicylanilides had been reported as proton motive force uncouplers,56 we decided to perform all modeling on the acetylated compounds. The acetylated salicylanilides showed an even spread in biological activity at 20 and 10 μM compound concentrations and those inhibitory values were used as response variables. In total 15 acetylated salicylanilides were used in the training set. A number of conventional methods were used in attempts to compute QSAR models. Expansion of the DOOD parameters for the BBs followed by PLS regression and variable selection did not lead to any significant models. The description of the molecules used for the SMDs was only based on 1D and 2D descriptors. A larger number of descriptors including 3D descriptors were computed in an attempt to address the problematic QSAR modeling. A rough conformational search was performed where each bond was rotated 60º and the lowest energy conformation of each BB was further energy minimized using the MMFF94 force field in MOE20. Additional 1D, 2D, and 3D MM and semi-empirical descriptors were computed. Local PCA models for the BBs were computed and the score vectors were extracted and expanded. PLS regression and variable selection did not yield any models with positive Q2 values. Much of the variation found among the BBs apparently did not have any correlation with the response.
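For reference, Q2 is commonly defined as one minus the ratio of the predictive residual sum of squares (PRESS) from cross-validation to the total sum of squares of the response. The sketch below uses that common definition; the cross-validation scheme and scaling applied by the modeling software are not described here and are left out, and the example values are invented.

```python
def q2(observed, cv_predicted):
    """Cross-validated Q2 = 1 - PRESS / SS, with SS taken around the observed mean."""
    mean_obs = sum(observed) / len(observed)
    press = sum((y - p) ** 2 for y, p in zip(observed, cv_predicted))
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - press / ss_total

observed = [10.0, 20.0, 30.0, 40.0]          # invented responses
good = q2(observed, [12.0, 18.0, 33.0, 39.0])
poor = q2(observed, [40.0, 30.0, 20.0, 10.0])
print(round(good, 3), poor)
```

A Q2 at or below zero, as in the second call, means that the cross-validated predictions are no better than predicting the mean response for every compound.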
We needed to identify the variation in the BBs that correlated with the response. To do this PLS regression was performed on the BB level. PLS components were added until R2X reached 1.0 and the score vectors were extracted, combined, and expanded. PLS regression followed by variable selection gave a two component model (Q2 = 0.82). Figure 6.1 schematically illustrates the methodology used, figure 6.2 the observed versus calculated data, and figure 6.3 shows the model coefficients.
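The expansion of the combined score vectors can be illustrated for a single compound. The sketch follows the caption of figure 6.1 (square terms within each block and interaction terms between the two blocks) and assumes six score vectors for the salicylic acids and seven for the anilines; the score values are invented and the PLS regressions themselves are not shown.

```python
from itertools import product

def expand_terms(sal_scores, an_scores):
    """Add square terms for each block and cross terms between the two blocks."""
    expanded = {}
    expanded.update(sal_scores)                        # linear terms, X3
    expanded.update(an_scores)                         # linear terms, X4
    for name, value in sal_scores.items():             # square terms, X5
        expanded[f"{name}^2"] = value * value
    for name, value in an_scores.items():              # square terms, X6
        expanded[f"{name}^2"] = value * value
    for (ns, vs), (na, va) in product(sal_scores.items(), an_scores.items()):
        expanded[f"{ns} x {na}"] = vs * va             # interaction terms, X7
    return expanded

# Invented score values for one compound.
sal = {f"t{i}_Sal": float(i) for i in range(1, 7)}
an = {f"t{i}_An": 0.5 * i for i in range(1, 8)}
expanded = expand_terms(sal, an)
print(len(expanded))  # 6 + 7 + 6 + 7 + 42 = 68
```

Variable selection would then reduce this expanded set to the terms that correlate with the response.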
Figure 6.1. Schematic representation of the QSAR modeling of the acetylated salicylanilides. X1 and X2 are the matrices of BBs characterized with 1D, 2D, and 3D descriptors. X3 and X4 are the PLS score vectors derived from the BBs. X5 are the square terms of X3, and X6 the square terms of X4. X7 are the interaction terms between X3 and X4. Y is the reporter-gene signal inhibition at 10 and 20 μM compound concentration.
[Figure 6.1 flowchart: descriptors are calculated for the acetylated salicylic acids (X1) and the anilines (X2); PLS regression of each block against Y gives 6 and 7 PLS score vectors (X3 and X4); the linear terms are expanded (X3, X4, X5, X6, X7) and a final PLS regression against Y gives the model for the acetylated salicylanilides.]
Figure 6.2. Calculated versus experimental data at a) 10 μM and b) 20 μM compound concentrations.
Figure 6.3. QSAR model coefficients for the responses at 10 μM and 20 μM compound concentrations. The models consisted of nine linear terms (grey) and five interaction terms (black).
The compounds displaying a high T3S inhibition generally had oval-shaped, large anilines with large hydrophobic and negatively charged surfaces. The HOMO and LUMO orbital energies were generally low. The salicylic rings of the potent T3S inhibitors had large dipole moments, high hardness, high density, large negative charges, large negatively charged surfaces, and high HOMO orbital energies.
[Figure 6.2 plots, panels a and b: experimental versus calculated % reporter-gene signal inhibition, both axes 0-100, points labeled with compound IDs.]
[Figure 6.3 bar charts: coefficients of % inhibition at 10 μM and at 20 μM for the linear and interaction terms of the PLS score vectors (t1_Sal to t4_Sal, t1_An to t7_An).]
To get a complementary model that could discriminate between active and inactive acetylated salicylanilides, a PLS-discriminant analysis (PLS-DA) model was computed using the same strategy as outlined above. The luciferase light emission inhibition at 50 μM compound concentration was used as response and compounds were classified as active if displaying at least 40% inhibition. PLS regression on the BB level resulted in 11 score vectors for the anilines and salicylic acids respectively. Combination and expansion of the score vectors followed by PLS regression and variable selection resulted in a one-component PLS-DA model (R2Y = 0.75, Q2 = 0.65). The model showed good separation of the two classes along the direction of the PLS component (figure 6.4). The model was more complex than the PLS QSAR model, consisting of 15 linear terms, 1 square term, and 9 interaction terms (figure 6.5).
Figure 6.4. The PLS-DA model shows separation between the active (boxes) and inactive compounds (open circles) along the PLS score vector.
[Figure 6.4 plot: PLS score t[1] versus compound number for compounds 1a-3a and 5a-26a, with 2 SD and 3 SD limits marked.]
Figure 6.5. Coefficients of the PLS-DA model. The model consists of 15 linear (grey), one square (white), and nine interaction terms (black).
6.2. Salicylidene acylhydrazides (paper II)
Of the compound concentrations investigated, 25 μM gave the best spread in inhibition of T3S and the inhibition of the luciferase light emission signal observed at that concentration was therefore selected as response for modeling. The compounds that dose-dependently inhibited the reporter-gene signal with at least 40% at 50 μM and reduced YopH activity were classified as active. According to these criteria, 18 compounds were classified as active. The first attempt to establish a QSAR model was to use the SMD parameters in an effort to establish a linear model using PLS regression. No model could be computed and the data was therefore expanded with square, cubic, and interaction terms. PLS regression and variable selection did not result in any significant model. Much of the variation found in the descriptions of the compounds did not appear to have any correlation with the biological response. The same strategy as outlined for the salicylanilides was then applied. PLS regression at the BB level yielded 14 PLS score vectors for each BB set. The PLS score vectors were combined and expanded with square and interaction terms. PLS regression and variable selection gave a one-component model (Hi-PLS-1, R2Y = 0.67, Q2 = 0.51) that showed an S-shaped correlation between experimental and calculated luciferase signal inhibition (figure 6.6).
[Figure 6.5 bar chart: coefficients for class separation for the Sal and An score-vector terms.]
Figure 6.6. Calculated versus experimental % reporter-gene signal inhibition at 50 μM compound concentration of Hi-PLS-1. The data shows curvature, indicating that additional non-linear variables might be needed to get a more linear relationship. The last two numbers from the compound IDs are shown in the plot.
The PLS model was constructed from compounds showing a dose-dependent response and some inactive ones that also were calculated to be inactive in the model. One compound (ME0157) was included in the initial model computation as an inactive, but no models could be computed when that salicylidene acylhydrazide was in the training set. That compound might have been inactive due to, for example, efflux, or poor membrane permeability. This highlights the importance to remove compounds that do not share the same mechanism or are inactive due to other reasons than, for instance, poor affinity to the receptor. Just like the QSAR model of the salicylanilides, the model for the salicylidene acylhydrazides could not predict inactive compounds reliably. Inactive compounds are generally harder to correctly predict, partly since the inactivity can be due to several reasons not related to affinity. To classify inactive and active compounds, a PLS-DA model was computed using the same methodology as outlined in figure 6.1. The entire set of 50 salicylidene acylhydrazides, excluding ME0157 and five compounds that displayed inhibition of the luciferase light emission signal but lacked inhibition of
YopH activity, was used as training set. Compounds were classified as active if they displayed a minimum of 40% inhibition of the luciferase light emission signal, otherwise inactive. PLS regression at the BB level yielded 16 PLS score vectors for each BB set. The score vectors were combined and expanded with square and interaction terms. PLS regression and variable selection gave a one-component PLS-DA model (Hi-PLS-DA-1, R2Y = 0.67, Q2 = 0.55) that showed separation of active and inactive compounds along the direction of the PLS score vector (figure 6.7).
Figure 6.7. The PLS-DA-1 model shows good separation of active (boxes) and inactive (circles) compounds along the direction of the PLS score vector. The last two numbers from the compound IDs are shown in the plot.
Hi-PLS-1 and Hi-PLS-DA-1 both displayed reasonable statistics, but were difficult to interpret. Hi-PLS-1 consisted of 33 terms (figure 6.8) while Hi-PLS-DA-1 consisted of 42 terms (figure 6.9). The problematic interpretation stems from the fact that each PLS score vector from the same BB set contains all descriptors but with different weights applied to the individual descriptors.
[Figure 6.7 plot: PLS score t[1] versus compound ID, with 2 SD and 3 SD limits marked.]
Figure 6.8. Coefficients for luciferase light emission signal inhibition of Hi-PLS-1. The model consists of 19 linear terms, one square term, and 13 interaction terms. Four out of the 19 linear terms (black) were used for interpretation.
Figure 6.9. Coefficients for separation of active and inactive salicylidene acylhydrazides of Hi-PLS-DA-1. The model consisted of 23 linear terms and 19 interaction terms. Of the 23 linear terms, three were used for interpretation (black).
To get interpretable models an additional strategy was employed. The descriptors calculated for the BB sets were grouped based on the chemical features they described, forming groups for such features as size and hydrophobicity. Those descriptors that did not fit into any group were kept separate. PCA models were computed for each group separately and the PCA score vectors combined with the ungrouped descriptors were used as variables in QSAR modeling. The process is summarized in figure 6.10.
[Figure 6.8 bar chart: w*c (Hi-PLS-1), inhibition at 50 µM, for the SAL2 and HYD2 score-vector terms.]
[Figure 6.9 bar chart: w*c (Hi-PLS-DA-1), class 1 (actives) and class 2 (inactives), for the SAL1 and HYD1 score-vector terms.]
Figure 6.10. Establishment of QSAR models based on grouped variables. The descriptors of each BB set were grouped in six groups for the salicylic aldehydes (X11-6) and five for the hydrazides (X21-5). The descriptors that did not fit into any group were kept as separate variables (X1U and X2U). A PCA model was computed for each group of variables (X11-6 and X21-5) and the PCA score vectors were extracted and combined with the ungrouped variables (X1U and X2U), forming the two X-blocks (XSAL and XHYD) used in PLS modeling. XSAL and XHYD were combined and expanded followed by PLS regression and variable selection. Y is the reporter-gene signal inhibition at 25 μM compound concentration.
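The grouping strategy can be sketched in a few lines. This is a simplified illustration rather than the procedure used in the thesis: it extracts only the first principal component of each group by power iteration, mean-centers but does not scale the descriptors, and uses invented descriptor values and group names.

```python
import math

def first_pc_scores(rows, iterations=200):
    """Scores of the first principal component of mean-centered data (power iteration)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    loading = [1.0] * m
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(r, loading)) for r in centered]
        new = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        loading = [v / norm for v in new]
    return [sum(x * w for x, w in zip(r, loading)) for r in centered]

def build_block(groups, ungrouped):
    """One score column per descriptor group, followed by the ungrouped descriptors."""
    columns = {f"{name}_t1": first_pc_scores(rows) for name, rows in groups.items()}
    columns.update(ungrouped)
    return columns

# Invented descriptor values for three building blocks.
groups = {
    "size": [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]],
    "hydrophobicity": [[0.1, 0.3], [0.4, 0.2], [0.8, 0.9]],
}
ungrouped = {"pKa": [8.1, 7.4, 9.0]}
block = build_block(groups, ungrouped)
print(sorted(block))
```

The resulting block plays the role of XSAL or XHYD; more than one score vector per group could be kept in the same way.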
Using the outlined strategy illustrated in figure 6.10 and the same training set as for Hi-PLS-1, Hi-PLS-2 (figure 6.11, R2Y = 0.69, Q2 = 0.53) was computed. Hi-PLS-2 gives a better correlation between the inactive compounds experimental inhibition and their calculated inhibition than Hi-PLS-1 (figure 6.6). The middle-active compounds are not accurately calculated, especially in Hi-PLS-2. The compounds span 20% to 70% luciferase signal inhibition, while the calculated values in Hi-PLS-2 range from 30% to 55%. The terms constituting Hi-PLS-2 were readily interpretable, but the large number of terms made it impossible to directly translate the model terms into an optimal T3S inhibitor (figure 6.12).
[Figure 6.10 flowchart: the descriptors for the salicylic aldehydes are grouped into moment of inertia, atomic partial charges, size, surface, charge, and hydrophobicity descriptors (X11-6); those for the hydrazides into moment of inertia, size, surface, charge, and hydrophobicity descriptors (X21-5); variables that did not fit into any group form X1U and X2U. PCA score vectors are extracted and combined with the ungrouped variables (XSAL, XHYD), the data are combined and expanded (XSAL, XHYD, XSAL2, XHYD2, XSALXHYD), and PLS regression is performed against Y.]
Figure 6.11. Experimental versus calculated luciferase signal inhibition plot at 50 μM compound concentration of Hi-PLS-2. The last two numbers from the compound IDs are shown in the plot.
Figure 6.12. Coefficients for luciferase signal inhibition of Hi-PLS-2. The coefficients in black were used for interpretations. The model is highly complex to interpret with its four linear and 19 non-linear terms.
[Figure 6.11 plot (Hi-PLS-2): experimental versus calculated % reporter-gene signal inhibition, both axes 0-80, points labeled with compound IDs.]
[Figure 6.12 bar chart: w*c (Hi-PLS-2), inhibition at 50 µM, for the SAL and HYD descriptor terms.]
A PLS-DA model was computed based on the grouped variables, using the same strategy as for Hi-PLS-2 and the same training set as in Hi-PLS-DA-1. The resulting model, PLS-DA-2, had poor statistics compared to the other three models and was therefore discarded. The interpretation of the models was based on the coefficients of Hi-PLS-2, and some of the important properties identified in Hi-PLS-2 were also found in Hi-PLS-1. The pKa of the phenol in position two of the salicylic aldehydes was the most important term, but it was only prevalent in interaction terms. The electrostatic potential charges on the aromatic carbon atoms of the salicylic aldehydes were the second most important property, again only appearing in interaction terms. The third most important property was the shape of the salicylic aldehyde moiety. Additional attempts were made to compute models with a more linear relationship between experimental and calculated luciferase signal inhibition. Using the methodologies illustrated in figure 6.1 and figure 6.10, and a logit transformed response, a number of models were computed. None of these models showed improved statistics over the previously described models. Subsequent prediction of an external test set showed that the logit transformed models did not differ in their predictive power and were therefore discarded.
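A logit transformation of a percentage response can be sketched as follows. The clipping margin eps is an assumption introduced here to keep 0% and 100% finite; the text does not say how such values were handled.

```python
import math

def logit_percent(inhibition_percent, eps=1.0):
    """Logit of a percentage, clipped to [eps, 100 - eps] to avoid infinities."""
    clipped = min(max(inhibition_percent, eps), 100.0 - eps)
    p = clipped / 100.0
    return math.log(p / (1.0 - p))

print(logit_percent(50.0))   # 0.0
print(logit_percent(75.0) > 0, logit_percent(25.0) < 0)
```

The transformation stretches responses near 0% and 100%, which is why it is a natural candidate when a calculated versus experimental plot is S-shaped.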
6.3. Evaluation of QSAR models using external test sets
6.3.1. External test set for the salicylanilides (paper I)
The models computed for the salicylanilides were used to predict an external test set of 320 compounds. First the PLS-DA model was used to eliminate compounds that were predicted as inactive. The Hi-PLS model was then used to rank the compounds that were predicted as active. Three compounds (27a, 28a, and 29a, table 6.1) were manually selected based on substitution pattern and predicted inhibition of the luciferase light emission signal. 27a was built from BBs not used in the training set. 29a was built from a salicylic acid that had been used only once during synthesis of the training set and an aniline that had been used twice. The BBs constituting 28a had been utilized two and four times respectively. The predicted biological activity of 28a was the most
accurate out of the three compounds, likely due to more frequent inclusion of those BBs in the training set. The PLS-DA model could correctly classify the three compounds as active (29a inhibited T3S by 78 % at 50 μM compound concentration). Although the predictions of 27a and 29a were less accurate, the Hi-PLS model could correctly rank the three compounds.
ID    20 μM†   10 μM†   PLS-DA   20 μM‡    10 μM‡
27a   76       66       Active   100 ± 0   99 ± 0
28a   64       52       Active   74 ± 1    58 ± 0
29a   56       41       Active   36 ± 2    18 ± 1
Table 6.1. The external test set of acetylated salicylanilides. †Predicted reporter-gene signal inhibition, using the Hi-PLS model. ‡Experimental reporter-gene signal inhibition.
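The ranking claim can be checked directly against the 20 μM values in table 6.1. The short sketch below only compares the order of predicted and experimental values and the absolute prediction errors; it is not part of the original analysis.

```python
# 20 uM values taken from table 6.1.
predicted_20uM = {"27a": 76, "28a": 64, "29a": 56}        # Hi-PLS predictions
experimental_20uM = {"27a": 100, "28a": 74, "29a": 36}    # measured means

def ranking(values):
    """Compound IDs sorted from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)

same_order = ranking(predicted_20uM) == ranking(experimental_20uM)
abs_errors = {
    cid: abs(predicted_20uM[cid] - experimental_20uM[cid]) for cid in predicted_20uM
}
print(same_order, abs_errors)
```

The predicted and experimental rankings agree, and 28a has the smallest absolute error, in line with the discussion above.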
6.3.2. External test set for the salicylidene acylhydrazides (paper II)
The models of the salicylidene acylhydrazides were used to predict a virtual library of 4416 compounds (all combinations of the characterized BBs). Since the models consisted of a large number of nonlinear terms and the SMD was performed using only linear terms, we anticipated that the models would only be able to predict parts of the virtual library. DModX57 (the residual standard deviation) was used to filter out compounds likely to yield erroneous predictions. Cutoff values were selected, one for Hi-PLS-DA-1 and one for the two Hi-PLS models, that were near the critical distance, D-crit,58 of those models. The overlap between the predictions of the three models was evaluated and it was clear that the three models differed. Since all three models had comparable statistics, we decided to use them in
consensus. The virtual compounds that remained after filtration based on DModX were predicted in all three models. In total, 327 compounds were predicted as active in all three models, and five of these were manually selected. In addition, three compounds predicted as inactive in all three models were added. The set of eight salicylidene acylhydrazides was synthesized and tested for the ability to inhibit T3S in Yersinia (table 6.2). In addition, the compounds were evaluated for their ability to inhibit T3S in the intracellular pathogen Chlamydia trachomatis.59-64 C. trachomatis was allowed to infect HeLa cells, and the compounds were added to determine the minimum compound concentration needed to completely inhibit C. trachomatis growth, which depends on a functional T3S system (table 6.3).
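The filtering and consensus logic described above can be sketched as follows. This is only an illustration under stated assumptions: the `Prediction` class, the function `consensus_call`, the cutoff values, and the four example compounds are all hypothetical, and the real DModX values and critical distances come from the multivariate modeling software used in the thesis.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    dmodx: float   # residual distance of the compound to the model
    active: bool   # activity call made by the model


def consensus_call(preds, cutoffs):
    """Combine the calls of several models for one compound.

    preds and cutoffs are dicts keyed by model name. A compound whose
    DModX exceeds the cutoff in any model is treated as outside the
    models; otherwise all models must agree for a consensus call.
    """
    if any(p.dmodx > cutoffs[model] for model, p in preds.items()):
        return "outside model"
    calls = {p.active for p in preds.values()}
    if calls == {True}:
        return "active"
    if calls == {False}:
        return "inactive"
    return "no consensus"


# Hypothetical cutoffs: one for Hi-PLS-DA-1, one shared by the two Hi-PLS models.
cutoffs = {"Hi-PLS-1": 2.0, "Hi-PLS-2": 2.0, "Hi-PLS-DA-1": 1.5}

# Hypothetical virtual compounds with made-up DModX values and calls.
library = {
    "cmpd_A": {"Hi-PLS-1": Prediction(1.1, True),
               "Hi-PLS-2": Prediction(0.9, True),
               "Hi-PLS-DA-1": Prediction(1.0, True)},
    "cmpd_B": {"Hi-PLS-1": Prediction(1.4, False),
               "Hi-PLS-2": Prediction(1.2, False),
               "Hi-PLS-DA-1": Prediction(0.8, False)},
    "cmpd_C": {"Hi-PLS-1": Prediction(1.0, True),
               "Hi-PLS-2": Prediction(1.3, False),
               "Hi-PLS-DA-1": Prediction(1.1, True)},
    "cmpd_D": {"Hi-PLS-1": Prediction(3.4, True),
               "Hi-PLS-2": Prediction(1.0, True),
               "Hi-PLS-DA-1": Prediction(0.9, True)},
}

calls = {name: consensus_call(preds, cutoffs) for name, preds in library.items()}
```

In this sketch only compounds such as `cmpd_A` and `cmpd_B` would be candidates for an external test set, since the models agree and the compounds lie within the chosen distance to every model.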
[Structure drawings not reproduced.]
ID     | 100 μM † | 50 μM † | Hi-PLS-1 | Hi-PLS-2 | Hi-PLS-DA-1
ME0257 | 96 ± 2   | 84 ± 2  | 49       | 47       | Borderline active
ME0258 | 20 ± 4   | 26 ± 3  | 25       | 34       | Inactive
ME0259 | 58 ± 5   | 38 ± 6  | 73       | 48       | Active
ME0260 | 71 ± 4   | 44 ± 2  | 64       | 46       | Active
ME0261 | 39 ± 3   | 26 ± 4  | 68       | 51       | Active
ME0262 | 57 ± 8   | 22 ± 3  | 20       | 31       | Inactive
ME0263 | 24 ± 6   | 17 ± 15 | 26       | 34       | Inactive
ME0264 | 20 ± 15  | 28 ± 10 | 53       | 62       | Borderline active
Table 6.2. The experimental and predicted % reporter-gene signal inhibition of the external set of eight salicylidene acylhydrazides. †Means and standard deviations were calculated from triplicates, and experiments were reproduced on at least three separate occasions.
[Structure drawings not reproduced.]
ID     | Hi-PLS-1 | Hi-PLS-2 | Hi-PLS-DA-1       | Chlamydia MIC (μM) ‡ | Cell viability 50 μM † (%) | Cell viability 25 μM † (%)
ME0257 | 49       | 47       | Borderline active | Inactive             | 67 ± 4                     | 72 ± 1
ME0258 | 25       | 34       | Inactive          | Inactive             | 79 ± 3                     | 83 ± 2
ME0259 | 73       | 48       | Active            | 50                   | 60 ± 1                     | 77 ± 1
ME0260 | 64       | 46       | Active            | 25                   | 65 ± 2                     | 73 ± 3
ME0261 | 68       | 51       | Active            | 50                   | 74 ± 3                     | 75 ± 1
ME0262 | 20       | 31       | Inactive          | Inactive             | 33 ± 1                     | 58 ± 2
ME0263 | 26       | 34       | Inactive          | 50                   | 79 ± 1                     | 80 ± 4
ME0264 | 53       | 62       | Borderline active | 50                   | 72 ± 1                     | 75 ± 2
Table 6.3. Biological evaluation of the test set in C. trachomatis. The compounds were also tested for cell toxicity in HeLa cells. ‡Chlamydia MIC was tested in duplicates on at least three separate occasions. †Means and standard deviations were calculated from triplicates, and experiments were reproduced on at least three separate occasions.
The three compounds that were predicted as inactive were inactive in the luciferase assay. Out of the five compounds predicted as active, three were active (ME0257, ME0259, and ME0260). The experimental value of ME0259 was slightly below 40 % at 50 μM, but the inhibition increased with increased compound concentration. The predictions of the three models were compared to the experimental data obtained from the Chlamydia assay. Of
the three compounds predicted as inactive, ME0263 completely inhibited Chlamydia growth at 50 μM. Of the five compounds predicted as active, only ME0257 was inactive. Curiously ME0257 was the most potent inhibitor from the external test set against T3S in Yersinia. The differences in the experimental data could possibly be due to differing membrane properties of the two bacterial species and that the Chlamydia assay is performed in the context of eukaryotic target cells. The ability of the compounds to inhibit Chlamydia indicates that the compounds most likely are able to enter the eukaryotic cell and the experiments show that the salicylidene acylhydrazides can clear a Chlamydia infection at the cell level.
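The agreement between the model predictions and the Chlamydia results can be tallied with a short sketch. The entries are transcribed from Table 6.3; treating the borderline active calls of Hi-PLS-DA-1 as active, so that five compounds count as predicted active as in the text, is an assumption of this example, as are the names `table_6_3` and `tally`.

```python
# (Hi-PLS-DA-1 call, Chlamydia MIC in uM or None when no inhibition was seen),
# transcribed from Table 6.3.
table_6_3 = {
    "ME0257": ("borderline active", None),
    "ME0258": ("inactive", None),
    "ME0259": ("active", 50),
    "ME0260": ("active", 25),
    "ME0261": ("active", 50),
    "ME0262": ("inactive", None),
    "ME0263": ("inactive", 50),
    "ME0264": ("borderline active", 50),
}


def tally(table):
    """Count (predicted, observed) outcome pairs over all compounds."""
    counts = {}
    for label, mic in table.values():
        # Assumption: borderline active is counted as predicted active.
        predicted = "active" if label != "inactive" else "inactive"
        observed = "active" if mic is not None else "inactive"
        counts[(predicted, observed)] = counts.get((predicted, observed), 0) + 1
    return counts


counts = tally(table_6_3)
agreement = counts[("active", "active")] + counts[("inactive", "inactive")]
```

Under this assumption six of the eight compounds are classified in agreement with the Chlamydia assay, with ME0257 and ME0263 as the two exceptions discussed above.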
6.4. QSAR models used to develop azide containing T3S inhibitors for target identification
Upon irradiation with UV light, aryl-azides decompose into short-lived (~1 ns) singlet nitrenes and nitrogen.65 The nitrenes can react with functionalities in a protein, such as, amines and reactive aliphatic and aromatic carbon atoms to form covalent bonds between the aryl-nitrene and the protein. The nitrenes quickly rearrange into dehydroazepines that rapidly react with amines to form robust adducts. T3S inhibitors containing an aryl-azide could potentially be used to identify the biological target of the compounds, and also identify, for instance, what amino acids in the target they interact with. The QSAR models computed for the salicylanilides and the salicylidene acylhydrazides were used to predict the activity of new compounds containing the azide functionality.
6.4.1. Azide containing salicylanilides
For the salicylanilides, the T3S inhibition of the two combinations of 4-azido-phenylamine with 2-acetoxy-3,5-diiodo-benzoic acid and 2-acetoxy-5-iodo-benzoic acid was predicted using the Hi-PLS QSAR model (table 6.4). Only the iodine containing salicylic acids were investigated, since the iodine atoms can be replaced with a radioactive isotope of iodine should the compounds be active. A radioactive label facilitates detection and isolation of cross-linked proteins. The two compounds were then synthesized and biologically evaluated.66 Both compounds were predicted as active in the PLS-DA model.
The Hi-PLS model gave accurate predictions for 30, while 31 was predicted to have lower activity than the actual experimental value. Even though the predictions for 31 were off target, both models predicted the two compounds as active.
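[Structure drawings not reproduced.]
ID | Salicylic acid BB                   | 20 μM † | 10 μM † | PLS-DA | 20 μM ‡ | 10 μM ‡
30 | 2-acetoxy-3,5-diiodo-benzoic acid   | 63      | 44      | Active | 60      | 31
31 | 2-acetoxy-5-iodo-benzoic acid       | 43      | 22      | Active | 79      | 83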
Table 6.4. Experimental and predicted inhibition of the reporter-gene signal. †Predicted inhibition using the Hi-PLS model. ‡ Experimental inhibition of the luciferase light emission signal.
Future application of these compounds will include target identification in a chemical genetics approach.
6.4.2. Azide containing salicylidene acylhydrazides
An ongoing project within the research group, performed in collaboration with Andrew Roe's group at the University of Glasgow, aims at identifying target proteins of the compound class in Y. pseudotuberculosis and E. coli. A bacterial lysate from E. coli was treated with a polymer bound salicylidene acylhydrazide, and several candidate target proteins that bound specifically to the compound could be isolated (data not shown). For one of the putative targets, knockouts in E. coli and Y. pseudotuberculosis resulted in a T3S deficient phenotype (data not shown). Crystallography has to date only yielded apo structures, but aryl-azide containing salicylidene acylhydrazides could potentially facilitate this project. Through NMR studies, a salicylidene acylhydrazide has been shown to bind the protein (data not shown). The azide containing compounds could be incubated with the bacteria, and upon exposure to UV light the azide could form a covalent adduct with the target protein(s). If the compound were found to be attached to the potential target protein, this would prove that the salicylidene acylhydrazides enter the bacteria. Exposure of the compound treated protein to UV light could yield a covalent bond between the compound and the protein. If crystals could be obtained, information about the binding site and the bioactive conformation could be revealed. Using the Hi-PLS-1, Hi-PLS-2, and Hi-PLS-DA-1 models, all combinations of 4-azido-benzoic acid hydrazide and the 48 salicylic aldehydes used in the SMD were predicted. Hi-PLS-DA-1 could not reliably predict any of the compounds, as their DModX values were very high. Eight compounds were predicted by both Hi-PLS-1 and Hi-PLS-2 to display at least 50% inhibition of the luciferase light emission signal (table 6.5). Unfortunately, there was not enough time to synthesize the compounds before the deadline for the thesis completion.
[Structure drawings not reproduced.]
ID | Hi-PLS-1 | Hi-PLS-2
32 | 96       | 61
33 | 94       | 58
34 | 81       | 55
35 | 71       | 53
36 | 65       | 55
37 | 65       | 54
38 | 60       | 54
39 | 55       | 56
Table 6.5. Predicted reporter-gene inhibition of aryl-azide containing salicylidene acylhydrazides.
6.5. Conclusions from QSAR modeling
QSAR models have been established for the acetylated salicylanilides and the salicylidene acylhydrazides, and the models have been evaluated with external test sets. In order to compute QSAR models, compounds need to interact with a biological target in the same manner, as mentioned in section 3.1.2. The QSAR models presented in this thesis have, however, been computed from biological data obtained from living bacteria. The models therefore likely capture several mechanisms. This is most likely the reason for the highly nonlinear relationships obtained. For the salicylidene acylhydrazides and the acetylated salicylanilides, much of the variation found among the BBs did not correlate with the biological response, and PLS regression had to be performed on both the BB and product levels in order to establish QSAR models. Importantly, the models can be used to predict the activity of new compounds targeting T3S in fully functional bacteria. This eliminates the need to model such factors as membrane permeability separately. While the models may not be easily interpretable, they have been able to predict activity in complex infection assays. SMD has been of great importance for computing the presented QSAR models. As highlighted by the complexity of the QSAR models computed for the salicylidene acylhydrazides, the detailed characterization of the BBs has proven to be important. SMD gave datasets of compounds that explored the calculated properties at different levels, resulting in a balanced spread in biological response. The QSAR models obtained for the salicylidene acylhydrazides differed in their predictions and it was not possible to rank the models. Several models used in consensus were shown to give useful predictions of activity for the salicylidene acylhydrazides. Consensus predictions of several orthogonal models can be used in place of any single QSAR model and will likely give more accurate predictions.
7. Scaffold hopping from a salicylidene acylhydrazide
The salicylidene acylhydrazides are interesting molecules as probes in chemical biology experiments, such as the identification of target proteins. The compounds, however, suffer from a number of challenges associated with the core scaffold. They are unstable in slightly acidic environments, generally display poor solubility, and offer little structural novelty since salicylidene acylhydrazides have appeared in several commercial screening libraries. Scaffold hopping was used to identify new scaffolds that would circumvent some or all of these issues and hopefully also retain the biological function of the salicylidene acylhydrazides. A virtual salicylidene acylhydrazide was created in MOE20 (figure 7.1a) and a conformational analysis was performed using the program MacroModel67. Only one conformation (QS2, figure 7.1c) was identified within 10 kJ mol-1 of the global minimum conformation (QS1, figure 7.1b). The identified conformations were both entirely planar and were used as query scaffolds. The phenyl substituent was removed from the identified conformations and replaced with an attachment point to yield a third query scaffold (QS3, figure 7.1d).
Figure 7.1. The structures used for the scaffold hopping. R is an attachment point defined in SHOP, where parallel chemistry can be used to introduce structural variation. All structures were planar. a) The salicylidene acylhydrazide used as basis for the scaffold hopping. b) The global minimum conformation identified in MacroModel (QS1). c) The second low energy conformation identified in MacroModel (QS2). d) A query scaffold obtained by replacement of the phenyl with an attachment point in either of the two low energy conformations (QS3).
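The selection of query conformations by an energy window can be illustrated with a minimal sketch. The function name `low_energy_conformers`, the conformer labels, and the energies are hypothetical; the actual conformational search was run in MacroModel and its energies are not reported in this section.

```python
def low_energy_conformers(energies_kj_mol, window=10.0):
    """Return conformers within `window` kJ/mol of the global minimum.

    energies_kj_mol maps a conformer label to its energy in kJ/mol.
    The result maps each retained label to its energy relative to the
    global minimum.
    """
    e_min = min(energies_kj_mol.values())
    return {
        label: energy - e_min
        for label, energy in energies_kj_mol.items()
        if energy - e_min <= window
    }


# Hypothetical energies for three conformers (kJ/mol).
example = {"conf_A": -250.0, "conf_B": -244.5, "conf_C": -231.0}
kept = low_energy_conformers(example)
```

Here `conf_A` plays the role of the global minimum (QS1) and `conf_B` that of the single additional conformation inside the 10 kJ mol-1 window (QS2), while `conf_C` is discarded.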
The three query scaffolds were used as input in the program SHOP39 and three separate searches were performed. Since we did not have any information about the target or the compounds' bioactive conformation, the
scaffold search in SHOP was run at default settings. The reference database, containing 10,556 scaffolds and 124,317 conformers, was built using the Maybridge building block collection from 2005 and the virtual reaction manager of SHOP.40 The reaction manager is used to construct virtual libraries from known reactions and commercially available BBs. The scaffold hopping search from QS1 gave 2-(2-amino-6-phenyl-pyrimidin-4-yl)-2,2-difluoro-1-phenyl-ethanol (S1) the highest similarity score and the compound was selected for synthesis (figure 7.2a). Scaffold hopping from QS2 did not result in any core structures with a good shape overlap with the query scaffold. A number of interesting scaffolds were identified in SHOP based on QS3, some of which we intended to explore by synthesis of a small number of analogs (figure 7.2b).
Figure 7.2. The scaffolds identified in SHOP that were targeted for synthesis. a) The fluoroethanol S1 was the highest ranked hit from the scaffold hop from QS1. b) S2 and S3 were the highest ranked hits from the scaffold hop from QS3, while S4 was among the ten highest ranked scaffolds. Scaffold S5 was not included in the Maybridge database but was recognized as an analog series to S4 and was selected for synthesis in place of S4.
The thiophene scaffold (S4) from the search based on QS3 was manually replaced with a thiazole scaffold (S5) as we believed S5 to show more structural resemblance to the salicylidene acylhydrazones. S5 was not included in the Maybridge database.
7.1. 2-(2-Amino-pyrimidin-4-yl)-2,2-difluoro-1-(phenyl)-ethanols (paper III)
A substructure search in SciFinder on 2-(2-amino-pyrimidin-4-yl)-2,2-difluoro-1-(methyl)-ethanol (figure 7.3) did not yield any hits (performed 2010-01-15), indicating that the compound class would potentially solve the novelty issue. The first retrosynthetic analysis we performed identified 2,2-difluoro-3-hydroxy-3-phenyl-propionaldehyde as a precursor to target compound S1 (scheme 7.1).
Figure 7.3. The structure used for a substructure search in SciFinder.
Scheme 7.1. First retrosynthetic analysis of target compound S1.
A four step synthesis of the aldehyde was planned (scheme 7.2). Using the same conditions as reported by Kumadaki et al.,68 ester 40a was synthesized in 80% yield. The secondary alcohol was protected as its TBDMS ether (41) using TBDMSOTf and 2,6-lutidine in DMF in an ice bath overnight, in 70% yield. Other protection strategies were less successful. Deprotonation of the alcohol with sodium hydride followed by alkylation with benzyl iodide or methyl iodide was attempted at both room temperature and under microwave heating at 150 ºC, but no product was obtained. Protection with TBDMSOTf in DMF after deprotonation with sodium hydride in an ice bath gave several byproducts, whereas exchanging sodium hydride for 2,6-lutidine gave the TBDMS protected alcohol in the 70% yield noted above. The ester was then reduced to the primary alcohol using sodium borohydride at room temperature for 3 h to give 42 in 90% yield. Oxidation of the primary alcohol to an aldehyde was then attempted using both Swern and Dess-Martin oxidations, but we did not manage to oxidize the primary alcohol (scheme 7.2).
[Figure 7.3 and Scheme 7.1 drawings not reproduced. The only recoverable label in Scheme 7.1 is a Horner-Wadsworth-Emmons reaction step.]
Scheme 7.2. Attempted synthesis of the aldehyde.
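The cumulative yield over the three steps that did work in this route follows from multiplying the individual step yields reported in the text (80%, 70%, and 90%). A minimal sketch, with illustrative variable names:

```python
from math import prod

# Isolated yields for the steps leading to 40a, 41, and 42, as given in the text.
step_yields = {"40a": 0.80, "41": 0.70, "42": 0.90}

# Overall yield of alcohol 42 from the starting materials.
overall_to_42 = prod(step_yields.values())

print(f"Overall yield to 42: {overall_to_42:.1%}")
```

The product is about 50%, so roughly half of the starting material is carried through to alcohol 42 before the oxidation step that could not be accomplished.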
Since the primary alcohol could not be oxidized under the given conditions, further synthesis towards S1 was planned from the ester 41. The TBDMS protected secondary alcohol was used in the next synthesis. We envisioned a three step procedure starting from 41 involving synthesis of the dione, cyclocondensation with guanidine followed by deprotection of the TBDMS group (scheme 7.3).
Scheme 7.3. A second retrosynthetic analysis identified compound 41 as a precursor to S1 in a three step procedure.
Acetophenone was deprotonated using sodium hydride in THF solution under nitrogen atmosphere at 0 ºC for 1 h. Compound 41 was added to the mixture, which resulted in formation of several unidentified byproducts. The decomposition was most likely due to deprotonation of the benzylic carbon, which would result in elimination of fluoride and formation of a reactive alkene. The released fluoride would likely deprotect the TBDMS group. Compound 41 was therefore not used for any further attempts to synthesize S1, since the secondary alcohol likely needed to be unprotected to avoid formation of the reactive alkene. A third retrosynthetic analysis identified 4,4-difluoro-5-hydroxy-1,5-diphenyl-pent-1-yn-3-one as a key intermediate. To avoid overalkynylation of ester 40a, which was anticipated to be a problem, the synthesis was planned via a Weinreb amide (scheme 7.4).
[Scheme 7.2 and Scheme 7.3 drawings not reproduced. Recoverable labels from Scheme 7.2: Wilkinson's catalyst, MeCN, RT, 24 h, 40a (80%); TBDMSOTf, 2,6-lutidine, DMF, 41 (70%); NaBH4, THF, RT, 3 h, 42 (90%).]
Scheme 7.4. The Weinreb amide was identified as a possible precursor to S1.
The synthesis of S1 (45a) and analogs is summarized in scheme 7.5. To a solution of 40a in THF, NaOH (aq., 1 M) was added and the mixture was stirred at r.t. for 1 h. The carboxylic acid was isolated after acidic workup and extraction with EtOAc, and used directly in the next step. The acid chloride was formed in situ by the addition of oxalyl chloride and a catalytic amount of DMF to the carboxylic acid in DCM. N,O-dimethyl-hydroxylamine hydrochloride and TEA were added and the mixture was stirred at r.t. overnight to produce 43a in 79% yield. Three equivalents of lithium phenylacetylide were added to a solution of 43a in THF under nitrogen atmosphere at 0 ºC. At this temperature 43a was quickly dialkynylated, so the reaction was repeated at -78 ºC for 3 h. After workup a crude oil of 44 with a purity of ca 80% was isolated. Flash chromatography did not remove the impurities and some material was lost on the silica column. The crude product was cyclized directly using guanidine hydrochloride, K2CO3, and MeCN according to Tomkinson and co-workers' protocol69 to produce the target molecule 45a (S1) in 61% yield. A number of analogs to 45a were targeted for synthesis (scheme 7.5). The aldehydes were chosen to mimic the hydrazides found in active compounds, identified primarily in the designed set of salicylidene acylhydrazides. To avoid the extra step involving hydrolysis of the ester to the carboxylic acid, direct transformation of the ester to a Weinreb amide was for some compounds performed through activation of the ester with dimethylaluminum chloride. The products obtained after alkynylation of 43a-d proved to be unstable and were therefore used in the final synthetic step directly after workup. Li2CO3 and Na2CO3 were used for some cyclization reactions, but the yield was not affected by the choice of base.
Scheme 7.5. Synthetic procedure for 2-(2-Amino-6-phenyl-pyrimidin-4-yl)-2,2-difluoro-1-phenyl-ethanols.
[Scheme 7.5 drawing not reproduced. Conditions, substituents, and yields recovered from the scheme are listed below.]
Conditions: aldehyde and ethyl bromodifluoroacetate, Et2Zn, MeCN, Wilkinson's catalyst, RT, 24 h (40a-d); NaOH, THF, RT, 20 min, then 1) oxalyl chloride, DMF, DCM, 2) CH3NHOCH3 * HCl, TEA, RT, 14 h, or alternatively 1) Me2AlCl, DCM, 2) CH3NHOCH3 * HCl, RT, 14 h (43a-d); lithium acetylide (R2 = Ph, n-propyl, cyclopropyl, CH2NMe2, TMS), THF, -78 ºC, 3 h (44; R1 = H, R2 = Ph, 43%); guanidine hydrochloride, base, MeCN, MWI, 120 ºC, 20 min (45a-o).
R1    | Ester (yield) | Weinreb amide (yield)
H     | 40a (80%)     | 43a (79%)
m-OMe | 40b (59%)     | 43b (68%)
m-CF3 | 40c (80%)     | 43c (75%)
o-F   | 40d (60%)     | 43d (62%)
ID  | R1    | R2          | Yield
45a | H     | Ph          | 26%
45b | H     | n-propyl    | 23%
45c | H     | cyclopropyl | 11%
45d | H     | CH2NMe2     | <1%
45e | H     | H           | 8%
45f | m-OMe | Ph          | 25%
45g | m-OMe | n-propyl    | 12%
45h | m-OMe | cyclopropyl | 11%
45i | m-CF3 | Ph          | 42%
45j | m-CF3 | n-propyl    | 19%
45k | m-CF3 | cyclopropyl | 24%
45l | m-CF3 | H           | 28%
45m | o-F   | Ph          | 9%
45n | o-F   | n-propyl    | 3%
45o | o-F   | H           | 12%
The compounds were evaluated for their ability to inhibit the luciferase light emission signal at 50 and 200 μM compound concentrations, using a previously described protocol.53 45e and 45n showed a slight effect (~25% inhibition) on the light emission signal at 200 μM concentration and were tested at 1 mM, 500 μM, and 250 μM concentrations. The compounds displayed ca 40% inhibition of the luciferase light emission at 1 mM, which was too low to warrant further biological evaluation in cell-based assays. The compounds were also tested for bacterial growth inhibition at these concentrations; a slight effect was detected at 1 mM, while no effect was observed at 500 μM or lower. A substructure search in SciFinder (performed 2010-03-02) of the scaffold indicates that the compounds are novel, as no publications with analogs to S1 were found. The compounds do not inhibit T3S according to the criteria that were set for the salicylidene acylhydrazides and acetylated salicylanilides (minimum 40% reporter-gene signal inhibition at 50 μM compound concentration). The compounds display high aqueous solubility and are generally more polar than the salicylidene acylhydrazides. This may result in lower membrane permeability for the compounds, which might be one reason that the compounds are not active against T3S. The compounds are drug-like, as they all fulfill Lipinski's rule of five5. Salicylidene acylhydrazides have been reported to be active against a wide range of biological systems,70-81 and the fluoroethanols (45a-o) may prove to be active in other systems than T3S.
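The rule of five check behind the drug-likeness statement can be sketched as below. The thresholds are those of Lipinski's rule (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). For S1 (45a, C18H15F2N3O) the molecular weight and the donor and acceptor counts are derived here from the structure name; the logP value is a placeholder assumption, since no calculated logP is given in this section, and the function name is illustrative.

```python
def rule_of_five_violations(mol_weight, logp, h_donors, h_acceptors):
    """Return the list of Lipinski criteria that a compound violates."""
    checks = {
        "MW <= 500": mol_weight <= 500,
        "logP <= 5": logp <= 5,
        "H-bond donors <= 5": h_donors <= 5,
        "H-bond acceptors <= 10": h_acceptors <= 10,
    }
    return [name for name, passed in checks.items() if not passed]


# S1 (45a), C18H15F2N3O: MW about 327.3 g/mol, 3 N-H/O-H donors,
# 4 N/O acceptors. The logP of 3.0 is an assumed placeholder value.
s1 = {"mol_weight": 327.3, "logp": 3.0, "h_donors": 3, "h_acceptors": 4}
s1_violations = rule_of_five_violations(**s1)

# A made-up counterexample that breaks two criteria.
heavy_violations = rule_of_five_violations(650.0, 6.2, 2, 4)
```

An empty list means that all four criteria are met, which is the sense in which the fluoroethanols are described as drug-like above.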
7.2. Thiazoles (paper IV)
Compounds with the thiazole scaffold could be synthesized using four established reactions. BB selection was based on features found in active salicylidene acylhydrazides that were previously published53. Suzuki coupling of 4-bromo-thiazole-2-carbaldehyde with 2-methoxy-phenyl-boronic acids gave the 4-(2-methoxy-phenyl)-thiazole-2-carbaldehydes (46a-46d) in yields ranging from 57-67%. Alkylation of the aldehydes using substituted aryl Grignard reagents gave the secondary alcohols (47a-47i) in yields ranging from 35-89%. Dess-Martin oxidation produced the ketones (48a-48i) in yields ranging from 53-86%. Finally, demethylation using boron tribromide was performed, but the yields were generally low (6-50%) as the compounds decomposed during the reaction. Ten target compounds (49a-49j), with a purity of at least 95% according to 1H NMR and LCMS, were synthesized (scheme 7.6).
Scheme 7.6. Synthesis of substituted thiazoles.
All the synthesized compounds, including the intermediates, were biologically evaluated in the reporter-gene assay. Compound 49i displayed 32-48% reporter-gene signal inhibition at the four tested concentrations from 12.5 up to 100 μM. The near absence of a concentration dependence indicates poor solubility of the compound. Growth experiments showed that the compound did not inhibit growth at 12.5 μM, where it had a significant effect on the reporter-gene signal. Even though the compound has rather low solubility, 49i displays potency comparable to the more potent salicylidene acylhydrazides at 12.5 μM. Further development of this compound could focus on improving the solubility, which might be limited by pi-stacking of the compound. Introduction of flexible substituents, such as benzyl substituents in place of the phenyl groups, might increase solubility. Since the SAR of the salicylidene acylhydrazides indicated several nonlinearities between the two BB sets, entirely different substituents than phenyl groups might be used in order to minimize the risk that compounds are inactive due to unfavorable combinations of BBs.
[Scheme 7.6 drawing not reproduced. Conditions, substituents, and yields recovered from the scheme are listed below.]
Conditions: Suzuki coupling, 15 mol% Pd(PPh3)4, Cs2CO3, 1,4-dioxane, reflux (46a-d); 4 eq. R3-MgCl, THF, 0 ºC to r.t. (47a-i); 2 eq. DMP, CH2Cl2, r.t. (48a-i); 5 eq. BBr3, CH2Cl2, -78 ºC (49a-j).
ID  | R1  | R2  | Yield
46a | Br  | H   | 61%
46b | H   | OMe | 61%
46c | Cl  | H   | 57%
46d | OMe | H   | 67%
Entry | R1  | R2  | R3                       | 47 (yield) | 48 (yield)
a     | Br  | H   | phenyl                   | 82%        | 86%
b     | H   | OMe | phenyl                   | 80%        | 100%
c     | Cl  | H   | phenyl                   | 80%        | 90%
d     | OMe | H   | phenyl                   | 43%        | 53%
e     | Br  | H   | p-methyl phenyl          | 83%        | 52%
f     | Br  | H   | p-trifluoromethyl phenyl | 35%        | 84%
g     | Br  | H   | p-chloro phenyl          | 89%        | 97%
h     | Br  | H   | 2-thiophenyl             | 79%        | 27%
i     | Br  | H   | p-methoxy phenyl         | 83%        | 74%
ID  | R1 | R2 | R3                       | Yield
49a | Br | H  | phenyl                   | 8%
49b | H  | OH | phenyl                   | 50%
49c | Cl | H  | phenyl                   | 22%
49d | Br | H  | p-methyl phenyl          | 9%
49e | Br | H  | p-trifluoromethyl phenyl | 15%
49f | Br | H  | p-chloro phenyl          | 14%
49g | Br | H  | 2-thiophenyl             | 50%
49h | Br | H  | p-methoxy phenyl         | 6%
49i | Br | H  | p-hydroxy phenyl         | 45%
49j | OH | H  | phenyl                   | 41%
7.3. Other scaffolds
Compounds with scaffold S2 (figure 7.2) could be synthesized from 2-(2-methoxy-phenyl)-ethylamine and a carboxylic acid, followed by demethylation of the methoxy group, to yield N-[2-(2-hydroxy-phenyl)-ethyl]-amides. This compound class was only briefly investigated, as only one target compound, 51, was synthesized (scheme 7.7).
Scheme 7.7. Synthesis of N-[2-(2-hydroxy-phenyl)-ethyl]-amides.
Demethylation with boron tribromide gave a poor yield, but since the synthesis of the compound only involved two steps, enough material for biological evaluation could be obtained. Analogous compounds can be synthesized using other carboxylic acids. Additionally, the phenol or the methyl protected phenol can be brominated to yield substituted phenols that are similar to those frequently found in biologically active salicylidene acylhydrazides, such as ME0168, ME0169, and ME0260 (see table 1 in paper II). The linker between the phenol and the amide bond can also be modified, allowing a greater variation of phenolic BBs to be introduced in analogous compounds. Compounds with the core structure S3 (figure 7.2) can be readily synthesized starting from the same functionalized pyridine as in the synthesis of 50. Amide coupling of the carboxylic acid would in one step introduce functionalities that could mimic properties of the hydrazide part of the salicylidene acylhydrazides. The formation of a Weinreb amide would allow subsequent functionalization via, for instance, Grignard or alkyllithium reagents to introduce a ketone functionality. Subsequent Suzuki coupling would introduce the phenol substituent directly, or the methyl protected phenol that upon demethylation would yield the target compounds (scheme 7.8).
[Scheme 7.7 drawing not reproduced. Recoverable labels: 1) oxalyl chloride, DMF, CH2Cl2; 2) amine, TEA, r.t., 14 h, 50 (90%); BBr3, CH2Cl2, r.t., 14 h, 51 (8%).]
Scheme 7.8. Examples of some reactions that can be used to synthesize compounds with core structure S3 in 2-4 steps.
Compound 54 was synthesized in three steps (scheme 7.9). Again, poor yields were obtained after demethylation with boron tribromide. For compounds without halide substituents, other demethylation reagents could be used, such as the industrial method using 1-dodecanethiol and NaOH at high temperature, reported by Chae.82
Scheme 7.9. Synthesis of 54, a compound with the S3 scaffold.
The compounds have not yet been biologically evaluated. If these compound classes are to be further investigated, small libraries of compounds (>10) with the S2 or S3 scaffold should be synthesized. BBs that mimic parts of active salicylidene acylhydrazides, to increase the chance of finding active compounds, should be chosen for this purpose.
[Scheme 7.8 and Scheme 7.9 drawings not reproduced. Recoverable labels from Scheme 7.8: amide coupling, Weinreb amide formation, alkylation or arylation, Suzuki coupling, demethylation, target compounds with S3 core. Recoverable labels from Scheme 7.9: 1) oxalyl chloride, DMF, CH2Cl2; 2) amine, TEA, r.t., 5 h, 52 (82%); boronic acid, potassium triphosphate, PEPPSI-iPr, dioxane/water 4:1, MWI, 110 ºC, 20 min, 53 (59%); BBr3, CH2Cl2, r.t., 14 h, 54 (19%).]
8. Concluding remarks
This thesis describes the development of T3S inhibitors that were first identified in biological screening of a commercially available compound library. SMD and synthesis facilitated the computation of QSAR models for two classes of compounds, the salicylidene acylhydrazides and the acetylated salicylanilides. The QSAR models were evaluated with external test sets, and these models can be of great utility in designing new compounds to be used as research tools for elucidating further information about the function and regulation of T3S. Scaffold hopping was used to identify alternatives to the salicylidene acylhydrazides, and two scaffolds were investigated through the synthesis of libraries of analogs. The 2-(2-amino-pyrimidine)-2,2-difluoro-ethanols were novel in that the scaffold of these compounds appears not to have been published previously. The compounds did not inhibit T3S, but may prove to be useful in other medicinal chemistry projects, considering the large number of applications that exist for the salicylidene acylhydrazides. The synthesis of thiazole compounds resulted in one active T3S inhibitor. This compound appeared to have low aqueous solubility, and possible future developments might include the optimization of solubility and potency. Several interesting scaffolds were identified in the scaffold hopping search that have not been investigated or presented in this thesis. Future research within the research group may be to pursue these other scaffolds or perform new scaffold hopping searches when additional information about the mode of action becomes available. In addition to scaffold hopping on the existing databases, the virtual reaction manager in SHOP may be used to construct additional core structures from new building blocks sold through commercial vendors, which may be added to the old reference databases.
If any new core structures are identified as T3S inhibitors, SMD and QSAR modeling can be applied to further develop them, as described for the salicylidene acylhydrazides and the acetylated salicylanilides.
9. Acknowledgements
There are several people I would like to thank as this last chapter of the thesis is being written. My supervisor Mikael Elofsson, who has always taken the time to discuss things, even outside scheduled hours. Many thanks for awakening my interest in organic synthesis and for successfully persuading me, over the last few years, to do a postdoc, even though I was reluctant for a long time. You have always, despite short notice, taken the time to write letters of recommendation. Lately we have also frequently had Friday drinks, and we have always been able to discuss relevant things such as film, music, alcohol, food, and war games. My co-supervisor, Anna Linusson Jonsson, who helped me with the computational chemistry and always had good comments, suggestions, and views on my work and manuscripts. Christoffer Bengtsson, who can almost always recommend good synthetic methods when the synthesis work has been troublesome. Marcus Carlsson, who has always helped make our parties fun. Special thanks for crystallizing one of my acetylated salicylanilides! Dan Johnels and Bertil Eliasson, who have often taken the time to discuss synthesis and NMR problems. My faithful party friends Henrik, Thomas, and John (even though the parties are now rare and often ended with Marcus chasing me with John's machete sword). Caroline, Åsa, and Anna K. for help with all the biology and for discussions in a field I have not mastered. Jonas Eriksson, who has lately been my faithful lunch companion. David Andersson, who has often contributed critical review of my ideas, manuscripts, and more.
Christopher Öberg and Mikael Hillgren for synthesis discussions and good collaboration on the scaffold hopping compounds. Thanks also for taking part in the Friday drinks. My former diploma students Erika and Pär, who toiled with the fluoroethanols. Joel for the work on the azide containing salicylanilides. Öjvind Davidsson, my former group manager at AstraZeneca, for letters of recommendation and good supervision during my two years at the company. Per-Anders for entertaining topics of conversation. Thanks also in advance for the biological evaluation of my compounds. Maciej, my old office mate, who had to endure a few jokes about berry picking. Sara, Sussi, and Weixing for contributing to fun group activities, meetings, and a pleasant atmosphere in the group. Pia, Charlotta, and Ulrik at Creative Antibiotics for help with assays. Creative Antibiotics for funding during the last year, and for invitations to group meetings and the Christmas dinner at the officers' mess. Anton for discussions about manuscripts and computational methods. Mattias Hedenström for help with NMR related matters and a shared interest in fantasy literature. Henrik Antti for suggestions on how to enumerate the salicylidene acylhydrazides. The whisky club for introducing me to new whiskies and for the opportunity to try haggis. Magnus Sellstedt for synthesis and NMR discussions. Andreas Larsson and Fredrik Almqvist for organizing meetings and more. Sofie, Lotta, and Knatti, among other things for essential pub visits.
My family for moral support, even if the skepticism has been there. And extra thanks to everyone who has read the thesis! Special thanks to Christopher, who discovered a major error in the thesis title that everyone else, including the undersigned, had missed.