Molecular Modelling of Estrogren-Producing Enzymes CYP450 ...
Transcript of Molecular Modelling of Estrogren-Producing Enzymes CYP450 ...
Molecular Modelling ofEstrogen-Producing Enzymes CYP450Aromatase and 17β-Hydroxysteroid
Dehydrogenase Type 1
Sampo Karkola
Laboratory of Organic ChemistryDepartment of Chemistry
Faculty of ScienceUniversity of Helsinki
Finland
Academic Dissertation
To be presented with the permission of the Faculty of Science of the University ofHelsinki for public criticism in Auditorium A110 in the Department of Chemistry,
February 28th, at 12 o’clock.
Helsinki 2009
SupervisorProfessor Kristiina WahalaLaboratory of Organic ChemistryDepartment of ChemistryFaculty of ScienceUniversity of HelsinkiFinland
ReviewersProfessor Wolfgang SipplInstitute of PharmacyFaculty I of Natural Science–Biological ScienceThe Martin Luther University of Halle–WittenbergGermany
Professor Jerzy AdamskiGenome Analysis CentreInstitute of Experimental GeneticsGerman Research Center for Environmental HealthGermany
OpponentDr. Hugo Kubinyiass. Professor of Pharmaceutical ChemistryUniversity of HeidelbergGermany
ISBN 978-952-92-4942-8 (paperback)ISBN 978-952-10-5186-9 (PDF)
Edita Prima OyHelsinki 2009
ii
Contents
1 Introduction 1
2 Estrogen biosynthesis 5
3 CYP450 aromatase 9
3.1 Substrate recognition sites . . . . . . . . . . . . . . . . . . . . . 12
3.2 Mechanism of the catalytic reaction . . . . . . . . . . . . . . . . 13
3.3 Homology modelling of aromatase . . . . . . . . . . . . . . . . 15
3.4 Mutagenesis studies on aromatase . . . . . . . . . . . . . . . . 20
3.5 Phytoestrogens as aromatase inhibitors . . . . . . . . . . . . . 23
3.6 Commercial aromatase inhibitors . . . . . . . . . . . . . . . . . 28
4 17β-Hydroxysteroid dehydrogenase type 1 31
4.1 Structure and catalytic mechanism of 17β-HSD1 . . . . . . . . 32
4.2 Molecular modelling of 17β-HSD1 . . . . . . . . . . . . . . . . 34
5 Methods used in molecular modelling 39
5.1 Force fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.2 Homology modelling . . . . . . . . . . . . . . . . . . . . . . . . 43
iii
5.3 Molecular dynamics simulations . . . . . . . . . . . . . . . . . 46
5.3.1 Periodic boundary conditions . . . . . . . . . . . . . . . 47
5.3.2 Long-range interactions . . . . . . . . . . . . . . . . . . 47
5.3.3 Temperature and pressure monitoring . . . . . . . . . . 48
5.4 MDS trajectory analysis and protein model validation . . . . . 48
5.5 Ligand–protein docking . . . . . . . . . . . . . . . . . . . . . . 52
Search methods and scoring functions . . . . . . . . . . . . . . 53
5.6 QSAR methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.7 Pharmacophore modelling and virtual screening . . . . . . . . 57
6 Aims of the study 59
7 Results and discussion 61
7.1 The aromatase model and the bindingof phytoestrogens (I and II) . . . . . . . . . . . . . . . . . . . . 61
7.2 Molecular modelling of 17β-HSD1and its inhibitors (III and IV) . . . . . . . . . . . . . . . . . . . . 69
8 Conclusions 79
Bibliography
Structures of the essential amino acids
Original publications
iv
Abstract
Breast cancer is the most common cancer in women in the western coun-
tries. Approximately two-thirds of breast cancer tumours are hormone de-
pendent, requiring estrogens to grow. Estrogens are formed in the human
body via a multistep route starting from cholesterol. The final steps in the
biosynthesis include the CYP450 aromatase enzyme, converting the male
hormones androgens (preferred substrate androstenedione ASD) into estro-
gens (estrone E1), and the 17β-HSD1 enzyme, converting the biologically
less active E1 into the active hormone 17β-hydroxyestradiol E2. E2 is bound
to the nuclear estrogen receptors causing a cascade of biochemical reactions
leading to cell proliferation in normal tissue, and to tumour growth in can-
cer tissue. Aromatase and 17β-HSD1 are expressed in or near the breast tu-
mour, locally providing the tissue with estrogens. One approach in treating
hormone dependent breast tumours is to block the local estrogen produc-
tion by inhibiting these two enzymes. Aromatase inhibitors are already on
the market in treating breast cancer, despite the lack of an experimentally
solved structure. The structure of 17β-HSD1, on the other hand, has been
solved, but no commercial drugs have emerged from the drug discovery
projects reported in the literature.
Computer-assisted molecular modelling is an invaluable tool in modern
drug design projects. Modelling techniques can be used to generate a model
of the target protein and to design novel inhibitors for them even if the tar-
get protein structure is unknown. Molecular modelling has applications in
predicting the activities of theoretical inhibitors and in finding possible ac-
tive inhibitors from a compound database. Inhibitor binding at atomic level
can also be studied with molecular modelling.
To clarify the interactions between the aromatase enzyme and its substrate
and inhibitors, we generated a homology model based on a mammalian
CYP450 enzyme, rabbit progesterone 21-hydroxylase CYP2C5. The model
was carefully validated using molecular dynamics simulations (MDS) with
and without the natural substrate ASD. Binding orientation of the inhibitors
was based on the hypothesis that the inhibitors coordinate to the heme iron,
and were studied using MDS. The inhibitors were dietary phytoestrogens,
which have been shown to reduce the risk for breast cancer. To further val-
idate the model, the interactions of a commercial breast cancer drug were
studied with MDS and ligand–protein docking. In the case of 17β-HSD1, a
3D QSAR model was generated on the basis of MDS of an enzyme complex
with active inhibitor and ligand–protein docking, employing a compound
library synthesised in our laboratory. Furthermore, four pharmacophore
hypotheses with and without a bound substrate or an inhibitor were devel-
oped and used in screening a commercial database of drug-like compounds.
The homology model of aromatase showed stable behaviour in MDS and
was capable of explaining most of the results from mutagenesis studies. We
were able to identify the active site residues contributing to the inhibitor
binding, and explain differences in coordination geometry corresponding
to the inhibitory activity. Interactions between the inhibitors and aromatase
were in agreement with the mutagenesis studies reported for aromatase.
Simulations of 17β-HSD1 with inhibitors revealed an inhibitor binding mode
with hydrogen bond interactions previously not reported, and a hydropho-
bic pocket capable of accommodating a bulky side chain. Pharmacophore
hypothesis generation, followed by virtual screening, was able to identify
several compounds that can be used in lead compound generation. The
visualisation of the interaction fields from the QSAR model and the phar-
macophores provided us with novel ideas for inhibitor development in our
drug discovery project.
vi
Acknowledgements
This study was carried out in the Laboratory of Organic Chemistry, Depart-
ment of Chemistry, University of Helsinki. I am most grateful to my super-
visor, Professor Kristiina Wahala for introducing me to the field of molecu-
lar modelling. She has provided me with excellent tools to work with, from
computer hardware to state-of-the-art software. Moreover, her positive atti-
tude during the whole research project has encouraged me to never give up
in finding solutions to the problems encountered during the work. She has
also provided me with several interesting side projects, including organising
conferences, building the NetLab website and generating the EChemTest
database. I want to express my gratitude also to Professor Emeritus Tapio
Hase for accepting me as a graduate student to the Laboratory and for al-
ways being available to guide me through chemical problems.
My first steps in the field of molecular modelling were taken in the Heinrich
Heine University, Dusseldorf, Germany, under the supervision of Professor
Doctor h.c. Hans-Dieter Holtje. I am ever so grateful to him for accepting me
as a member of his research group and teaching me the basics of molecular
modelling. His excellent advice have helped me during my research work.
Professor Wolfgang Sippl and Professor Jerzy Adamski are thanked for re-
viewing this manuscript and for their constructive comments, and Doctor
Kathleen Ahonen is thanked for revising the language in this manuscript.
The research work and several trips to conferences would not have been
possible without The Finnish Academy, The National Graduate School of
Organic Chemistry and Chemical Biology, The University of Helsinki Re-
search Fund and the University of Helsinki Chancellor’s travel grant.
CSC – IT Center for Science Ltd. is thanked for supercomputer time and
excellent scientific and technical advice during the work.
I would like to thank the personnel at the Laboratory of Organic Chemistry
for always providing me with advice whenever I needed it. Special thanks
go to Dr. Jorma Koskimies for being a casual supervisor at the teaching
laboratory and a colleague in ECTN projects. Dr. Leena Kaisalo has been
an excellent colleague in organising conferences and Anna-Maaret Jarvikivi
has helped me in many administrative issues.
All the former and present members of our Phyto-Syn research group have
always made the working environment very pleasant, both on and off the
working hours. Special thanks go to Nanna and Antti for being such good
friends and giving at least one good laugh per day. Valtteri has helped me in
several computer-related problems, and despite my initial reluctance, talked
me into using Linux and LATEX.
My parents Liisa and Kape are, of course, thanked for their endless sup-
port and encouragement in every decision I have ever made in my life. My
siblings Petri, Venla, Eliisa and Teppana, together with their spouses, my
friends and cousins, and godparents Malla and Lasu, have always let me re-
lax and forget molecular modelling by not asking too many questions about
my work. Lauri Karmila / citronacid.com is thanked for designing the cover
of this book.
Finally and most importantly: Minttu, Aada and Lassi, you are the world to
me.
Helsinki, January 2009
Sampo Karkola
viii
List of original publications
This thesis is based on the following original publications, which are re-
ferred to in the text by their Roman numerals, and on unpublished results
described in the text.
I Sampo Karkola, Hans-Dieter Holtje and Kristiina Wahala, A three-
dimensional model of CYP19 aromatase for structure-based drug de-
sign. The Journal of Steroid Biochemistry and Molecular Biology 105, 63-70,
2007.
II Sampo Karkola and Kristiina Wahala, The binding of lignans, flavo-
noids and coumestrol to CYP450 aromatase: A molecular modelling
study. Molecular and Cellular Endocrinology, in press.
DOI:10.1016/j.mce.2008.10.003
III Sampo Karkola, Annamaria Lilienkampf and Kristiina Wahala, A 3D
QSAR model of 17β-HSD1 inhibitors based on a thieno[2,3-d]pyrim-
idin-4(3H)-one core applying molecular dynamics simulations and li-
gand–protein docking. ChemMedChem 3, 461-472, 2008.
IV Sampo Karkola, Sari Alho-Richmond and Kristiina Wahala, Pharma-
cophore modelling of 17β-HSD1 enzyme based on active inhibitors
and enzyme structure. Molecular and Cellular Endocrinology, in press.
DOI:10.1016/j.mce.2008.08.030
The author’s contribution to the publications is as follows:
I Model building, validation, analyses and manuscript preparation.
II Simulations and analyses of the complexes, dockings and their analy-
sis and manuscript preparation.
III Simulations and analyses, dockings and their analysis, building and
analysis of the 3D QSAR model and most of the manuscript prepara-
tion. This publication was also part of Dr. Annamaria Lilienkampf’s
thesis (University of Helsinki, 2007) concerning the synthesis of 17β-
HSD1 inhibitors.
IV Pharmacophore generations, virtual screening experiments, analyses
and manuscript preparation.
x
List of Abbreviations
17β-HSDcl . . . . . . 17β-hydroxysteroid dehydrogenase
from Cochliobolus lunatus
17β-HSD1 . . . . . . . human 17β-hydroxysteroid dehydrogenase
type 1 enzyme
AI . . . . . . . . . . . . . . . aromatase inhibitor
ANF . . . . . . . . . . . . α-naphthoflavone
ASD . . . . . . . . . . . . . androstenedione
CoMFA . . . . . . . . . . comparative molecular field analysis
CoMSIA . . . . . . . . . comparative molecular similarity indices analysis
CYP2C5 . . . . . . . . . rabbit cytochrome P450 2C5, progesterone 21-hydroxylase
CYP2C9 . . . . . . . . . human cytochrome P450 2C9, a drug
metabolising enzyme
CYP450 . . . . . . . . . cytochrome P450
CYP450arom . . . . . . cytochrome P450 aromatase
CYP450BM−3 . . . . . cytochrome P450 fatty acid monooxygenase
from Bacillus megaterium
CYP450cam . . . . . . . cytochrome P450 camphor monooxygenase
from Pseudomonas putida
CYP450eryF . . . . . . cytochrome P450 6-deoxyerythronolide B hydroxylase
from Saccaropolyspora erthrea
CYP450scc . . . . . . . cytochrome P450 side-chain cleavage enzyme
CYP450terp . . . . . . . cytochrome P450 α-terpineol monooxygenase
from Pseudomonas putida
xi
DFT . . . . . . . . . . . . . density functional theory
DHEA . . . . . . . . . . . dehydroepiandrosterone
DHEA-S . . . . . . . . . dehydroepiandrosterone sulfate
END . . . . . . . . . . . . enterodiol
ENL . . . . . . . . . . . . . enterolactone
E1 . . . . . . . . . . . . . . . estrone
E2 . . . . . . . . . . . . . . . 17β-estradiol
GA . . . . . . . . . . . . . . genetic algorithm
HTS . . . . . . . . . . . . . high-throughput screening
IC50 . . . . . . . . . . . . . inhibition concentration required to reduce
the biological activity by half
Ki . . . . . . . . . . . . . . . equilibrium dissociation constant
KM . . . . . . . . . . . . . . Michaelis-Menten constant
MAT . . . . . . . . . . . . matairesinol
MDS . . . . . . . . . . . . molecular dynamics simulation
NADPH . . . . . . . . . reduced form of nicotinamide adenine dinucleotide
phosphate
NADP+ . . . . . . . . . nicotinamide adenine dinucleotide phosphate
NDGA . . . . . . . . . . nordihydroguaiaretic acid
PDB . . . . . . . . . . . . . The Protein Data Bank
PLS . . . . . . . . . . . . . partial least squares analysis
PME . . . . . . . . . . . . particle-mesh Ewald method
QSAR . . . . . . . . . . . quantitative structure-activity relationship
SDR . . . . . . . . . . . . . short-chain dehydrogenase/reductase enzyme
SERM . . . . . . . . . . . selective estrogen receptor modulator
SI . . . . . . . . . . . . . . . secoisolariciresinol
STS . . . . . . . . . . . . . . steroid sulfatase
ST . . . . . . . . . . . . . . . steroid sulfotransferase
T . . . . . . . . . . . . . . . . testosterone
Vmax . . . . . . . . . . . . . the maximal reaction velocity of an enzyme
xii
Chapter 1
Introduction
Molecular modelling has become an essential tool in several fields of sci-
ence, including chemistry, physics, drug discovery, and biochemistry, to
name a few. The first efforts to model the structure of molecules were the
hand-drawn sketches of atoms joined together with bonds that appeared
in 1860s. Understandably, these models provided little (and then proba-
bly false) information about the actual 3D structure or the physico-chemical
properties of the molecule. In 1874, the Dutch chemist Jacobus Henricus
van’t Hoff and the French chemist Joseph Le Bel independently identified
the tetrahedral carbon as responsible for phenomena observed during their
studies on optical activity and stereochemistry. Since then, modelling has
developed to the Dreiding stick models used in solving the structure of
DNA, and now to computer-assisted studies. Today, modelling is done
solely on computers, and as the computational resources and method de-
velopment continue to expand, increasingly sophisticated models can be
built, simulated and analysed. With multi-core supercomputers and par-
allel computing, highly accurate calculations can be performed on small
molecules and larger biological systems, such as membrane-bound proteins
and lipid membranes containing large biomolecules totalling as many as
million atoms, can be modelled an simulated.
1
Chapter 1. Introduction
Breast cancer is the most common cancer in women in Western countries.
More than 180,000 new cases and over 40,000 deaths are expected to occur
in the United States in 2008 [1]. Approximately 60 % of pre-menopausal
and 75 % of post-menopausal cancers are hormone dependent [2]. In these
cancers, the cell proliferation and, therefore, tumour formation and devel-
opment are dependent on the availability of endogenous estrogens. These
are produced mainly by the ovaries, but also in adipose tissue, breast, skin
and bone [3]. Estrogen receptor-positive breast cancer can be treated by in-
terfering with either estrogen production or estrogen action [4]. To block
estrogen production, it is necessary to inhibit the enzymes responsible for
producing the estrogens: namely, CYP450 aromatase by using aromatase
inhibitors (AI), or 17β-hydroxysteroid dehydrogenase type 1 (17β-HSD1).
Inhibitors for 17β-HSD1 have not yet been developed. To prevent estrogen
action, the estrogen receptors in the tumour tissue cells are blocked with
selective estrogen receptor modulators (SERM). While SERMs can be used
in treating the breast cancer of both pre- and post-menopausal women, AIs
are only effective administered to post-menopausal women since they do
not stop the estrogen production in the ovaries of pre-menopausal women
[1] but only the local estrogen production in the cancer tissue. AIs are also
used to induce ovulation in pre-menopausal women [5]. When estrogen
production in the ovaries of pre-menopausal women is reduced, the levels
of gonadotropins become elevated and follicular growth is stimulated.
This thesis describes molecular modelling studies on two estrogen-produc-
ing human enzymes, aromatase and 17β-HSD1. The structure of aromatase
has yet to be solved, and therefore homology modelling (comparative mod-
elling) was used to predict its 3D structure. The model of aromatase was
built using the first crystallised mammalian CYP450 enzyme, rabbit CYP2C5,
as template. The model was validated with exhaustive molecular dynamics
simulations (MDS) with and without the natural substrate, androstenedione
(ASD). With the model in hand, the binding of dietary phytoestrogens and
their metabolites was studied.
2
The crystal structure of 17β-HSD1 was solved in 1995, and it was used in
the modelling work to study the interactions between the 17β-HSD1 en-
zyme and inhibitors synthesised in our laboratory. The dynamic process of
inhibitor binding to 17β-HSD1 was studied by MDS with use of the crys-
tal structure of the enzyme. Plausible binding modes of our 17β-HSD1 in-
hibitors were predicted with ligand–protein docking. A 3D QSAR model of
the inhibitors was created using biological data measured at Hormos Ltd.
and the alignment obtained from the dockings. Additionally, the important
interactions between the enzyme and the inhibitors were identified by phar-
macophore modelling. The resulting pharmacophore models were used as
a query in virtual screening searches aimed at finding new lead compounds
for our drug discovery project.
The topics reviewed in the following chapters 2-5 concern estrogen biosyn-
thesis and the roles of aromatase and 17β-HSD1 in it, previous molecular
modelling studies performed on these two enzymes, the role of phytoe-
strogens as aromatase inhibitors, and methods used in modern computer-
assisted molecular modelling.
3
Chapter 2
Estrogen biosynthesis
Estrogens are endogenous female hormones having a tetracyclic structure
with three six-membered rings and one five-membered ring fused together
to form a rigid and planar molecular skeleton. Estrogens differ from male
hormones, androgens, in having an aromatic ring A and lacking the C19
methyl group. Figure 2.1 shows the structures and stereochemistry of an-
drostenedione 1 (ASD) and estradiol 2 (E2).
22
33
4455
1010
11
66
77
88
99 1414
1313
1212
1111
1515
1616
1717
1818 O
O
an androgen(androstenedione ASD 1)
1919
OH
HO
an estrogen(17β-estradiol E2 2)
H
H
H H
H
H
Figure 2.1 – The structures of an androgen and an estrogen withstereochemistry and atom numbering shown. For clarity, hydro-gens connected to carbon atoms are omitted from the 3D models.Oxygen atoms are coloured red.
5
Chapter 2. Estrogen biosynthesis
Estrogens are produced in a multistep route starting from cholesterol 3,
which is first converted to pregnenolone 4 by cytochrome P450 side-chain
cleavage enzyme (CYP11A1, CYP450scc) (Figure 2.2). [6, 7] From pregnen-
olone 4, two major routes are possible. The ∆5 pathway, preferred in hu-
mans, proceeds via conversion to 17α-hydroxypregnenolone 5 by CYP17
(17α-hydroxylase), and to dehydroepiandrosterone 6 (DHEA) by the same
enzyme. DHEA 6 is converted to ASD 1 by 3β-hydroxysteroid dehydro-
genase (3β-HSD) and can be further converted to testosterone 7 (T) by 17β-
hydroxysteroid dehydrogenase (17β-HSD) enzymes. The ∆4 pathway, dom-
inant in rat, hamster and guinea-pig [8], proceeds via the conversion of preg-
nenolone to progesterone, 17α-hydroxyprogesterone and ASD. Either way,
the aliphatic ring A of the androgen (ASD 1 or T 7) is finally aromatised
by the aromatase enzyme complex to produce the estrogen (estrone E1 8 or
estradiol E2 2) (Figure 2.3). E1 8, the less potent estrogen, is converted to
the biologically active estrogen E2 2 by 17β-HSD1. After menopause, the
main sites for estrogen production are the peripheral tissues such as adi-
pose tissue, breast tissue and skin. [9] Here, estrogens, not to be delivered to
the circulation, but to be used in situ, are produced from adrenal precursors
DHEA 6 and its 3-sulfate (DHEA-S).
HO
cholesterol 3
HO
O
pregnenolone 4
CYP11A1 CYP17
HO
O
dehydroepiandrosterone DHEA 6
HO
O
17α-hydroxypregnenolone 5
OH
CYP17
O
O
androstenedione ASD 1
3β-HSD
Figure 2.2 – The formation of androstenedione from cholesterol.
6
Aromatase and 17β-HSD1 enzymes are the key players in the final steps of
estrogen biosynthesis leading to the inactive precursor E1 8 and the active
estrogen E2 2, respectively. The storage forms of estrogens (estrone-3-sulfate
9 and estradiol-3-sulfate 10) are formed by linkage of a sulfate group to the
phenolic 3-hydroxy group by steroid sulfotransferase (ST) (Figure 2.3). [10]
Most of the circulating estrogens are, in fact, sulfoconjugates, cleaved by
the steroid sulfatase enzyme (STS) when needed.[11] Estrogens bind to the
nuclear estrogen receptors (α or β) and cause a cascade of reactions lead-
ing to cell proliferation and, in cancer tissue, to tumour development [10].
Estrogens also exert extra-nuclear effects via kinases, ion channels, G pro-
teins and growth factor receptors, in addition to the genomic effects caused
by nuclear receptor binding.[12] Interfering with either estrogen formation
or its cellular effects has been a target for controlling hormone-dependent
tumour formation and development. The interference of estrogen forma-
tion involves blocking the actions of the endogenous enzymes producing
estrogens. Estrogen-producing enzymes include aromatase, which converts
O
HO
estrone E1 8
O
O
androstenedione ASD 1
aromatase
OH
O
testosterone T 7
OH
HO
estradiol E2 2
17β-HSD217β-HSD417β-HSD817β-HSD10
17β-HSD117β-HSD517β-HSD7
17β-HSD217β-HSD8
17β-HSD317β-HSD5
aromatase
O
O
OH
O
estradiol-3-sulfate 10
S
O
OHO
S
O
OHO
estrone-3-sulfate 9
STS
ST
STS
ST
Figure 2.3 – The final steps and the enzymes involved in estrogenbiosynthesis.
androgens to estrogens; 17β-HSD1, which converts estrone E1 8 (the prod-
uct of aromatase) to the biologically active estradiol E2 2, and STS, which
7
Chapter 2. Estrogen biosynthesis
releases estrogens from their sulfate storage forms. Interfering with the cel-
lular effects of estrogen, when it binds to cell nuclear receptors, involves
blocking of the estrogen receptor functions. This can be achieved with se-
lective estrogen receptor modulators (SERMs), which bind to the estrogen
receptors in place of the actual hormone. Commercial drugs representing
both aromatase inhibitors and SERMs are now widely used in the treatment
of hormone-dependent cancers.
8
Chapter 3
CYP450 aromatase
CYP450 enzymes catalyse a wide range of reactions from xenobiotic me-
tabolism to drug metabolism and steroid biosynthesis [13, 14]. Despite the
marked variation in sequence identities, the CYP450 enzymes share a com-
mon tertiary fold consisting of 12 – 15 α-helices (A through L) and four
β-sheets (1 through 4) (Figure 3.1) [15]. The eucaryotic CYP450 enzymes are
bound to the endoplasmic reticulum via a transmembrane helix located at
the N-terminus of the sequence. Because of this binding, solubilisation of
the enzymes requires the use of detergents, which easily destroy the del-
icate protein fold. A catalytically crucial protoporphyrin IX group with a
bound iron atom (the heme group, Figure 3.2) is located in the active site
of the enzymes. The iron atom is responsible for the oxidations performed
to the substrates. The iron in the heme ring has an octahedral coordina-
tion geometry with six coordination sites, with four of them occupied by
the nitrogens of the planar protoporphyrin ring. In contrast to other cy-
tochromes having a nitrogenous base coordinating to the heme iron as a
fifth ligand, the CYP450 enzymes have a thiolate anion, the side chain of
a cysteine, as fifth ligand (C437 in aromatase). The thiolate anion is essen-
tial for the activity, as is shown by mutating the coordinating cysteine to
a histidine for CYP450 camphor monooxygenase (CYP450cam): the mutant
9
Chapter 3. CYP450 aromatase
enzyme catalyzes the hydroxylation of camphor but at a much lower rate
than the wild-type enzyme [16]. The sixth coordination site is part of the
functionality of the active site and is occupied by a water molecule in the
steady-state form of the enzyme [17]. Before the catalytic reaction, the iron
binds a diatomic oxygen molecule, which provides the oxygen with which
the substrate is hydroxylated. The sixth coordination site is also the tar-
get for inhibitor design, providing the site for the inhibitor’s coordinative
counterpart (usually a basic nitrogen). The catalytic reaction it performs has
caused the CYP450 family to be referred to as the biological equivalent of a
blow-torch [18]. In the laboratory, the hydroxylation of an aliphatic carbon
requires drastic reaction conditions.
Figure 3.1 – The overall structure of a CYP450 enzyme (CYP2C5,PDB entry 1N6B). α-Helices are coloured red, β-sheets yellow andthe random coil and loop areas green. Heme is represented assticks and heme iron as a magenta sphere. The transmembranehelix is omitted from the Protein Data Bank structure file.
Aromatase (estrogen synthetase, P-450AROM, CYPXIX, EC 1.14.14.1) is a
single member of the CYP19 family, coded by the CYP19A1 gene and con-
sisting of 503 amino acid residues with total molecular weight of 57.9 kDa.
10
The catalytic complex consists of the aromatase enzyme and a nicotinamide
adenine dinucleotide phosphate (NADPH) cytochrome P450 reductase, a
ubiquitous flavoprotein, which provides the reaction with the required mo-
lar amount of NADPH (Figure 3.2) [19]. The whole complex is bound to
the endoplasmic reticulum membrane in the cell. Aromatase is mostly ex-
pressed in the ovaries of pre-menopausal women, in the placenta of preg-
nant women and additionally in the peripheral adipose tissue, breast tis-
sue and brain [20]. It is overexpressed in or near breast cancer tissue and
is responsible for the local estrogen production and proliferation of breast
cancer tissue [21]. The 3D structure of aromatase has yet to be solved due
to the difficulties in crystallisation associated with the membrane-bound
character of the enzyme. The active enzyme without the N-terminal trans-
membrane helix has been produced by recombinant expression techniques
[22, 23, 24, 25, 26]. Unless otherwise indicated, residue numbering in the
following text is from aromatase .
N
N N
N
HOOC COOH
Fe
N
N
NH2
N
N
OP
OO-
O-
O PO
O-O P
O
O-O N
O
NH2
H H
OH
O O
OHOH
heme group NADPH
Figure 3.2 – Left: The structure of the prosthetic group proto-porphyrin IX with a bound iron atom (the heme group) in aro-matase. Right: The structure of the cofactor NADPH, the reducedform of nicotinamide adenine dinucleotide phosphate, in CYP450-NADPH reductase and in 17β-HSD1.
11
Chapter 3. CYP450 aromatase
3.1 Substrate recognition sites
The active site of the CYP450 enzymes consists of the heme ring at one
wall and six common substrate recognition sites (SRS), as defined by Go-
toh [27] for CYP2 family enzymes (Figure 3.3). SRS-1 is located between the
B- and C-helices, where an additional B’-helix is observed in some CYP450
enzymes. SRS-2 and SRS-3, or the C-terminus of the F-helix and the N-
terminus of the G-helix, are located on the opposite wall of the active site
from the heme ring. This area varies among CYP450 family members show-
ing helices and loops of different lengths and with none, one or two addi-
tional helices (F’ and G’).[28] SRS-4 is located in the I-helix, which protrudes
through the whole enzyme and includes residues that are highly conserved
among the CYP450 enzymes. An interesting residue at SRS-4 is P308, which,
due to the inability to form a stabilising backbone hydrogen bond to the ad-
jacent L304, causes a kink in the I-helix. The active-site wall opposite the
I-helix consists mainly of SRS-5, the hydrophobic coil between the K-helix
and the β1-4-strand. SRS-6 is located in the β-sheet 4 or, more accurately, in
the hairpin turn between the two strands β4-1 and β4-2. This site includes
amino acids thought to be important not only to substrate recognition but
also to the catalytic activity (see Section 3.4 on page 20).
Figure 3.3 – A stereo image of the structural elements correspond-ing to the substrate recognition sites (SRS) of CYP2C5. SRS-1 iscoloured red, SRS-2 green, SRS-3 blue, SRS-4 yellow, SRS-5 cyanand SRS-6 orange. Heme is represented as sticks and the hemeiron as a magenta sphere.
12
3.2. Mechanism of the catalytic reaction
3.2 Mechanism of the catalytic reaction
Human aromatase catalyses a reaction in which the aliphatic ring A of an an-
drogen (ASD 1 or T 7) is converted to the aromatic ring A of an estrogen (E1 8
or E2 2). ASD is the preferred substrate (Figure 3.4) [29]. The overall reaction
requires three moles of oxygen and three moles of NADPH per one mole of
the substrate. The conversion from an aliphatic C19 androgen to an aromatic
C18 estrogen occurs in a sequence of three hydroxylation steps. Before the
O
O
androstenedione ASD 1
O
O
HO
O
O
HOOH
19-hydroxyandrostenedione 11
19-gem-dihydroxyandrostenedione 12
O2, NADPH
aromatase
O
HO
estrone E1 8
+ HCOOH + H2O
C19
O2, NADPH
aromatase
O2, NADPH
aromatase
Figure 3.4 – The overall aromatisation reaction of androgens. TheC19 is hydroxylated in three consecutive steps producing the es-trogen, formic acid and water.
catalytic reaction, the hydroxylating agent iron(IV)-oxo porphyrin radical
cation is formed by the addition of an electron pair, an oxygen molecule
and two protons to the heme iron (Figure 3.5) [17]. The catalysis starts with
the hydroxylation of C19 of ASD to give C19-hydroxyandrostenedione 11
by hydrogen abstraction–oxygen rebound mechanism [17, 30], followed by
another hydroxylation to the same carbon producing C19-gem-dihydroxy-
androstenedione 12. Although the mechanism of the third step has not
yet been solved, density functional theory (DFT) calculations and ab initio
dynamics simulations [31] suggest that first the 1β-hydrogen is abstracted
by the oxygen atom bound to the iron, facilitated by the enolisation of the
3-carbonyl. Subsequently, an electron is transferred from the alkyl radical
13
Chapter 3. CYP450 aromatase
FeIIIIII
FeIIII
e- O2Fe
IIIIII
OO
e-Fe
IIIIII
OO
H+
FeIIIIII
OOH
H+,-H2OFe
VV
O
Fe N
N N
NFe
COOHHOOC
=
FeIVIV
O
+
iron(V)-oxo species
Figure 3.5 – The formation of the iron(V)-oxo species, redrawnfrom reference [17].
FeIVIV
O
+
H
OH
HOOH
FeIVIV
O
OH
HOOH H
e- transfer
FeIIIIII
O
OH
HOO
H
H
FeIIIIII
OHH
-HCOOH
1β-H abstraction
OH
Figure 3.6 – The third step of the aromatase catalysed reaction pro-ducing the estrogen, formic acid and water, redrawn from refer-ence [31].
14
3.3. Homology modelling of aromatase
to the iron, and the gem-diol 12 is deformylated to the estrogen, formic acid
and water (Figure 3.6).
3.3 Homology modelling of aromatase
The lack of an experimental structure for aromatase requires other meth-
ods to be used in studying the overall protein fold and the structure of the
active site. These methods include in silico structure prediction tools such
as homology modelling and threading, as well as mutagenesis studies. Aro-
matase is a distant relative of other CYP450 enzymes, and the amino acid se-
quence identity to other CYP450 enzymes is approximately 20 % [32]. Build-
ing a homology model on the basis of a template with such a low sequence
identity is usually considered to be difficult and the results to be unreliable.
Despite the low sequence identity, however, all CYP450 enzymes possess a
common tertiary fold [28] (see Figure 3.1 on page 10), which makes it possi-
ble to derive the structure of an unknown CYP450 protein from on a known
one. Certain residues and the secondary structures are conserved in vary-
ing degrees throughout the CYP450 enzymes, although α-helices, β-strands,
turns and loops vary in length and amino acid constitution.
Differing from mammalian CYP450 enzymes, bacterial CYP450 enzymes
are not bound to the endoplasmic reticulum but are soluble. This prop-
erty makes it easier to crystallise them and to determine the 3D structure by
X-ray diffraction methods. Once the first bacterial CYP450 enzyme struc-
ture, cytochrome P450 camphor monooxygenase from Pseudomonas putida
(CYP450cam), was solved, in 1985 [33], the homology modelling of CYP450
enzymes could commence. The first aromatase models were based on this
and other bacterial crystal structures. Graham-Lorence et al. [34] created
an active site model by aligning the sequences of aromatase and CYP450cam.
The residues in the heme binding region and the I-helix of CYP450cam were
replaced by residues from aromatase. The other parts of the protein were
15
Chapter 3. CYP450 aromatase
deleted and energy minimisations were performed. Examination of the
model and the orientation of camphor in the CYP450cam crystal structure
suggested the orientation of the substrate androstenedione (ASD). The ac-
tive site model was also used to study the effects of mutagenesis experi-
ments performed on aromatase (see Section 3.4 on page 20).
Zhou et al. [35] developed an active site model for aromatase using a similar
approach, that is, replacing the residues of CYP450cam in the I-helix with
those of aromatase. The resulting I-helix was then energy minimised and
the substrate ASD was placed in the active site. A similar positioning of
ASD as in the work of Graham-Lorence [34] was suggested.
Laughton et al. [36] in 1993, were the first to use molecular dynamics sim-
ulations (MDS) in relaxing and validating the aromatase model. The se-
quences of aromatase and CYP450cam were aligned using an automated mul-
tiple alignment procedure [37], and the secondary structure elements were
predicted by the algorithm developed by Zvelebil et al [38]. The backbone
coordinates of the core parts of aromatase were assigned from the template,
and the loop backbone coordinates were generated with the loop-fitting pro-
cedure in QUANTA software [39]. The coordinates for the side chains were
assigned from the template, where applicable, and the rest of the side chains
were assigned arbitrary conformations. The system was energy minimised
starting from the least-well-defined parts and gradually moving to the bet-
ter predicted parts. Simulated annealing (a series of heating and cooling
of the system) was applied to the loops and undetermined core structures.
ASD was inserted into the active site cavity, and a 10 ps MDS at 300 K was
performed. As a result, a model for the active site and the orientation of the
substrate were suggested.
Later, in 1993 and 1994, two more bacterial CYP450 enzyme crystal struc-
tures were solved (cytochrome P450 bacterial fatty acid monooxygenase
CYP450BM−3 from Bacillus megaterium [40] and cytochrome P450 α-terpineol
monooxygenase CYP450terp from Pseudomonas putida [41]). CYP450BM−3 was
16
3.3. Homology modelling of aromatase
the first class II CYP450 enzyme, that is, a CYP450 enzyme interacting with
an NADPH-P450 reductase, whose structure was solved. These new struc-
tures provided more information on the sequence alignments between aro-
matase and the templates, and therefore, on the 3D structure of aromatase.
On the basis of the new sequence alignment, a new core structure was sug-
gested for aromatase [42]. A de novo structure construction was combined
with replacement of CYP450BM−3 residues by ones from aromatase. Min-
imisations and MDS for a few picoseconds in vacuum were applied to relax
the model. The orientation of the substrate ASD was also suggested.
Koymans et al. [43] developed a model of aromatase using CYP450cam as
the template. After sequence alignment, ASD and several non-steroidal
aromatase inhibitors were manually placed inside the active site and the
complexes were energy minimised. Analyses of the minimised structures
indicated that the C-terminus of the aromatase sequence (471-486) is an im-
portant part of the active site.
A CYP450BM−3 based model was generated by Kao et al. [44] using an
alignment from Nelson [45]. The alignment was manually refined and main
chain coordinates of the core region were taken from the template. The loop
areas were searched from a database and the side-chain conformations were
predicted on the basis of a local homology relationship [46] followed by min-
imisation. The generated model was compared with the CYP450cam -based
model of Laughton et al. [36]. Because of differences in the lengths of helices
F and G, and thus, difficulties in positioning the substrate ASD, the authors
preferred the model based on CYP450cam.
Lewis and Lee-Robichaud [47] generated homology models for four steroido-
genic CYP450 enzymes (cholesterol side-chain cleavage enzyme, 17-α-hy-
droxylase, aromatase and 21-hydroxylase) based on the crystal structure of
CYP450BM−3. After a computer-assisted sequence alignment, a manual re-
finement was performed and the models were generated by the Biopolymer
module implemented in the Sybyl software [48]. The crude models were en-
17
Chapter 3. CYP450 aromatase
ergy minimised, and substrates and inhibitors were docked into the models.
The results were in broad agreement with mutagenesis data at hand.
Cavalli et al. [49] generated an aromatase model based on an alignment
taken from Graham-Lorence et al [42] that was refined by aromatase sec-
ondary structure predictions from the PsiPred server [50, 51]. The authors
used the structure of CYP450cam with a bound phenylethylimidazole deriva-
tive because they were generating a model for the interaction between a
specific class of inhibitors and aromatase. SCR regions of aromatase were
generated from the template, while the SVR regions were built de novo. Be-
fore the optimisation of the generated structures, two inhibitors were placed
in the active site, using their crystal coordinates and interacting them with
the heme iron. After energy minimisations, the complexes were subjected to
molecular dynamics simulations for 100 ps at physiological temperature to
relax and optimise the models. Average structures from the stable simula-
tion trajectory were used to analyse the interactions between the inhibitors
and aromatase. The interactions were found to be similar to those presented
earlier by Koymans [43]. Additionally, a 3D QSAR model was generated by
the comparative molecular field analysis method (CoMFA).
Auvray et al. [52] also used the alignment from Graham-Lorence et al [42].
The core region coordinates were taken from the template (CYP450BM−3)
and the loops were searched from a database. The crude models were en-
ergy minimised, after which the substrate ASD was placed inside the active
site and the complex was minimised. The effect of several mutations of aro-
matase was studied with the help of the model.
The first mammalian CYP450 enzyme structure was solved in 2000 [13]. This
was CYP2C5, rabbit 21-progesterone hydroxylase, which catalyses the hy-
droxylation of C21 of progesterone. Because both aromatase and CYP2C5
are membrane-bound mammalian CYP450 enzymes with a steroid as a sub-
strate, it was clear that the newly solved structure provided a better tem-
plate for homology modelling of human aromatase than the soluble bac-
18
3.3. Homology modelling of aromatase
terial CYP450 enzymes. The first homology model based on this template
was reported by Chen et al. [53], though neither the sequence alignment be-
tween CYP2C5 and aromatase nor the details of the modelling process were
published. Loge et al. [32] also built a model based on CYP2C5, but used
a crystal structure with a bound inhibitor (PDB entry 1N6B [54]). The se-
quence alignment was generated with ClustalW [55] and refined using the
PsiPred server [50, 51]. The actual aromatase model was generated with the
Nest program implemented in Jackal software [56, 57]. The crude model
was energy minimised gradually from the core side chains to the core main
chains, after which two inhibitors were docked into the active site of the
enzyme using GOLD [58]. The results from the docking studies were in
agreement with previously reported results [59, 60, 61, 53, 62].
The crystal structure of the first human cytochrome P450 enzyme, CYP2C9,
was solved in 2003 [63]. CYP2C9, a drug-metabolising CYP450 enzyme, was
subsequently used by Favia et al [64] as a template for aromatase homology
modelling. The sequence alignment was done with the LALIGN server [65],
and secondary structure elements were predicted with the PsiPred server
[50, 51]. Several models were generated using the MODELER software [66].
The top ranking model was validated with MDS (10 ns simulation time) us-
ing the AMBER8 software [67]. In addition to the aromatase model, the au-
thors used DFT calculations to develop parameters for the protoporphyrin
IX ring with bound iron and cysteine coordination.
A model based on CYP2C5 was developed by Murthy et al. [68] to study
the acidic residues located in the active site of aromatase. However, the se-
quence alignment was not published. The model was built with the MOD-
ELER software [66] and energy minimised for a starting structure for MDS.
The simulation was performed in water for 1 ns using AMBER6. The acid-
ity of certain active site residues was predicted with and without natural
substrates.
The model described in detail in publication I was built on CYP2C5, a mam-
19
Chapter 3. CYP450 aromatase
malian CYP450 enzyme that has a steroid as a substrate. The crystallised
human CYP2C9 was also available, but as it is not a steroid-metabolising
CYP450 enzyme, its use as a template was rejected.
3.4 Mutagenesis studies on aromatase
The aromatase enzyme has been a target for nearly 80 mutagenesis studies.
Most of these were performed to identify the residues contributing to the
active site and the catalytic mechanism, while some involved mutations to
residues outside the active site to clarify the effect on the overall tertiary fold
(and the function) of the enzyme. In view of the large number of studies per-
formed, only the results for the most important residues are discussed in the
following. Most mutagenesis investigations have been performed in whole-
cell assays, in which the kinetic constants KM (Michaelis-Menten constant;
the substrate concentration at which the enzymatic reaction velocity is half
of its maximal value) and Vmax (the maximal reaction velocity) may be af-
fected by intracellular actions. Although microsomal preparations in vitro
would, thus, be more reliable in determining kinetic constants, in vitro as-
says have the disadvantage of not representing living cells. Unless other-
wise indicated, the substrate in the assays discussed below is androstene-
dione (ASD). All the mutagenesis studies performed to date are listed in
Table 3.1.
Most of the mutated residues are located at SRS-1 and SRS-4. SRS-1 is the
B–C loop (Figure 3.3 on page 12) forming one wall of the active site and
including a B’-helix in some of the CYP450 enzymes.[28] There are two hy-
drophobic residues that are thought to interact directly with the substrate,
namely I133 and F134. The mutation of I133 to bulkier residues (I133W
and I133Y, [44]) indeed leads to a dramatically reduced Vmax and an in-
creased KM in microsomal assay, indicating that the volume of the active
site is diminished and, therefore, less favoured for the substrate. Amarneh
20
3.4. Mutagenesis studies on aromatase
et al. [69] found that mutant F134E, where a bulky hydrophobic residue is
changed to a negatively charged one, has similar KMbut higher Vmax value
than the wild-type enzyme. This indicates that the F134E mutation leads
to a more active enzyme, although the affinity for the substrate remains ap-
proximately the same. It was suggested that a negatively charged residue at
this position takes part in the catalytic mechanism. The group also designed
a double mutant F134E|E302A to find out if a negatively charged residue at
134 could replace the functionality of a negatively charged residue at 302.
However, the results of this study have not yet been published.
Table 3.1 – Mutagenesis studies reported to date for the aromataseenzyme.
Mutationa SRSb Ref. Mutationa SRSb Ref. Mutationa SRSb Ref.-10Nc – [69] L228E 2 [69] R365A – [70]-20Nd – [69, 70] K230A 3 [62] R365K – [70]T14A – [69] K230R 3 [62] V388E – [69]F116E – [69] F235L 3 [44] I395F – [44, 71]K119E 1 [52] F235N 3 [72] F406R – [73, 74]K119T 1 [52] C299A 4 [35] H457N – [52]K119V 1 [52] E302A 4 [52] S470N – [52]K119Y 1 [52] E302D 4 [34, 62, 75] I471M – [52]G121A 1 [72] E302Q 4 [34] I474F – [72]Q123E 1 [70] E302V 4 [34] I474M – [44]Q123H 1 [70] E302A 4 [34, 52] I474N – [44]C124Y 1 [52] E302L 4 [35] I474T – [52]I125M 1 [52] GAGVe 4 [34] I474W – [44]I125N 1 [72] A307G 4 [34] I474Y – [44, 71]H128A 1 [35] P308F 4 [73, 76, 74, 71] H475A – [52]H128Q 1 [35] P308V 4 [34] H475E – [52]K130N 1 [52] D309A 4 [35, 76, 74, 71, 52, 75] H475R – [52]I133M 1 [77] D309N 4 [35, 76, 74, 69] D476A – [52]I133W 1 [44] D309V 4 [69] D476E – [52]I133Y 1 [44, 71] T310A 4 [69] D476K – [52]F134E 1 [69] T310C 4 [70] D476L – [52]K216A 2 [62] T310S 4 [70, 69, 44] D476N – [52]K216R 2 [62] S312C 4 [35] S478A 6 [62]Q225A 2 [69] F320C – [52] S478T 6 [62, 75]Q225N 2 [69] Y361F – [73] H480K 6 [62]L228A 2 [69] Y361L – [73] H480Q 6 [62, 75]
aFor a list of amino acids and their abbreviations, see page 97bDefined for CYP2 family enzymes [27]cDeletion of the first 10 N-terminal residuesdDeletion of the first 20 N-terminal residueseI305G, A306A, A307G, P308V
21
Chapter 3. CYP450 aromatase
The residues located at SRS-4 (Figure 3.3 on page 12), that is, the centre of the
I-helix, form an important part of the active site and most likely participate
in the catalytic mechanism. Not surprisingly they have been the subject of
several mutagenesis studies. The mutation of the negatively charged E302 to
hydrophobic residues Ala, Leu or Val or to a polar residue Gln led to unsta-
ble or inactive enzymes.[34, 35, 52] Moreover, E302D exhibited dramatically
decreased activity compared with that of the wild-type enzyme, suggesting
that even an Asp cannot properly replace the functionality of a Glu at this
position.[34, 62] P308 is an interesting residue located at the centre of the
I-helix. Because it cannot form a a hydrogen bond from the backbone nitro-
gen to the adjacent carbonyl oxygen of L304, it creates a bend in the helical
structure. The P308F mutant decreased the Vmax for ASD and T suggest-
ing an altered active site.[73] KM was decreased significantly for ASD and
slightly for T, indicating that ASD binds more strongly to the mutant than
to the wild-type enzyme. In another study [76], however, the same muta-
tion P308F reportedly increased the KM value for ASD 4-fold rather than
reducing it. The mutant P308V, in turn, had a Vmax value for ASD half that
of the wild-type enzyme, and the KM was increased 4-fold.[34] If a valine
at this position does not cause a bend in the I-helix, as is suggested by the
secondary structure prediction programs, then a bend in the I-helix is nec-
essary for the optimal function of the enzyme. The necessity of the bend is
hard to prove, however, owing to the diverse results from different mutage-
nesis studies.
D309 has been the subject of several mutagenesis studies, aimed at revealing
the importance of a negatively charged residue at this position. The muta-
tion of D309 to a hydrophobic residue Ala or Val led to almost total inactivity
of the enzyme, as shown by the decrease in Vmax values.[35, 76, 74, 44, 52, 69]
However, widely ranging KM values for mutants D309A and D309V have
been obtained by different research groups. Vmax for the D309N mutant was
not as dramatically decreased as that for the D309A or D309V mutant. Also
the KM for D309N was lower than for the wild-type enzyme, suggesting that
22
3.5. Phytoestrogens as aromatase inhibitors
a polar residue can serve the function of a negatively charged residue at this
position, at least to some extent.[70] It should be mentioned that although
mutant D309E has not been constructed, a Glu is located at this position in
CYP450BM−3 and in human and bovine CYP45017α, showing that a negative
charge is important at this position. T310 is conserved in all CYP450 en-
zymes except CYP450eryF[78], being located in SRS-4 and immediately next
to the heme group. T310 is thought to take part in the catalytic mechanism
and/or to provide a hydrogen bond to the backbone carbonyl of A306, thus
stabilising the bend formed by P308.[70, 71] Mutants T310A and T310C re-
sulted in the decrease or total loss of the activity.[69, 70]. A conservative
mutant T310S showed decreased activity compared with that of the wild-
type enzyme, but it is still functional.[70, 69, 44]
Residues at SRS-6 are part of a hairpin turn of β-sheet 4 and in aromatase
include residues S478 and H480. The Vmax values for the mutants S478A,
S478T and H480Q were significantly reduced as compared with tha of the
wild-type enzyme in both in-cell and microsomal assays.[62, 75] However,
in microsomal assay, H480K had closely similar activity to that of the wild-
type enzyme, suggesting that a positively charged residue is needed at this
position.[62]
As described above, much research has been done to identify the residues
contributing to the aromatase active site in an important way. Some of the
results are conflicting, probably because the groups used different test con-
ditions. A great deal of valuable information about the active site of aro-
matasehas nevertheless been aqcuired. In time, an experimentally solved
aromatase structure will prove the theories right or wrong.
3.5 Phytoestrogens as aromatase inhibitors
Phytoestrogens are polyphenolic compounds widely present in nature. Many
plants included in the human diet, soy, flaxseed, berries, grapes etc., are
23
Chapter 3. CYP450 aromatase
good sources of phytoestrogens.[79, 80, 81, 82, 83] Their dietary intake has
been associated with reduced breast cancer risk.(reference [84] and refer-
ences therein). The two major classes of phytoestrogens are flavonoids (in-
cluding subclasses flavones, flavanones and isoflavonoids) and lignans (lig-
nanolactones and lignanodiols) (Figure 3.7). Phytoestrogens mimic endoge-
nous estrogens in shape, size and physico-chemical properties making it
possible for them to bind to the same receptors and enzymes as do the estro-
gens. The similarity between estradiol and 6,4’-dihydroxyflavone is shown
in Figure 3.8. An extensive of aromatase inhibition by phytoestrogenic com-
pounds and their use in drug discovery has been published by our research
group.[85]
OH
OH
lignanolactones lignanodiols
77
6655
88
4433
22O
O
4'4'3'3'
A
B
C
O
O
flavones isoflavones
8'8'
88
7'7'
77
1'1'
11
2'2'3'3'
4'4'
4433
22
99
O
9'9'
O
A
B
Figure 3.7 – The general structures of lignanolactones, lignanodi-ols, flavones and isoflavones with stereochemistry and atom num-bering shown.
24
3.5. Phytoestrogens as aromatase inhibitors
Figure 3.8 – The similarity of shapes and the oxygen atom loca-tions in E2 (left) and in 6,4’-dihydroxyflavone (right). The oxygenatoms are coloured red and the carbon skeleton is coloured grey.Hydrogens are omitted for clarity.
Figure 3.9 – Left: The optimal coordination angle and average dis-tance (see publication II) in a carbonyl coordination to the hemeiron. Right: The optimal tilting angle between the heme plane andthe carbonyl group plane. Carbons are coloured grey, oxygen iscoloured red and the heme iron is coloured magenta. Heme sidechains are omitted for clarity.
25
Chapter 3. CYP450 aromatase
Natural phytoestrogens have been widely studied for their aromatase in-
hibiting properties. As a rule, flavones are better inhibitors than isoflavones
and lignans. It has been suggested, on the basis of spectroscopic stud-
ies, that the flavonoids coordinate to the heme iron via the C4 carbonyl
oxygen.[86, 87, 71, 88] A theoretical optimal binding geometry is achieved
when the Fe–O=C angle is 120◦ and the tilting angle between the heme plane
and the carbonyl plane is 90◦. Additionally, a database search in the Pro-
tein Data Bank (PDB) yielded an experimental average distance of 0.19 nm
between the iron and the oxygen (see publication II). The optimal binding
geometry is shown in Figure 3.9. The dihedral angle N-Fe-O-C also plays a
role in the optimal binding geometry, in that the functional groups attached
to the carbonyl carbon should not lead to sterical clashes with the protopor-
phyrin ring.[89] The best natural flavonoid inhibitor is α-naphthoflavone
13 (ANF) (Figure 3.10), which has an IC50 value of 0.07 µM in human pla-
cental microsome-based assay.[86] Substitutions in the flavonoid skeleton
affect the binding strength in various ways. 7-Hydroxyflavone is a more
potent aromatase inhibitor than flavone, as can be seen from reported IC50
values 0.5 µM and 10 µM [90], and 0.21 µM and 48 µM [91], respectively.
Hydroxylation at C7 can overcome the unfavourable influence of other sub-
stitutions, as seen from the IC50 values of 4’-hydroxyflavone (180 µM) and
7,4’-dihydroxyflavone (2 µM) [90]. A hydroxy group at C5 or C3 forms a
hydrogen bond to the C4 carbonyl, weakening its ability to coordinate to
the heme iron. This can be seen by comparing the IC50 values of flavone
(10 µM), 3-hydroxyflavone (140 µM) and 5-hydroxyflavone (100 µM).[90]
Isoflavonoids have the phenyl ring attached to C3, instead of to C2 as in
flavones, and as a result the C4 carbonyl coordinating to the heme iron is
sterically hindered (Figure 3.7 on page 24 and publication II). The best isofla-
vonoid inhibitor is biochanin A with an IC50 value of 15.5 µM in a cell-based
assay.[71] For a complete list of the effects of the substitution on biological
activity, see reference [85] and publication II.
Lignans have been shown to be competitive, though weak inhibitors of
26
3.5. Phytoestrogens as aromatase inhibitors
secoisolariciresinol SI 17
OH
OHHO
OHO
O
matairesinol MAT 16
O
OHO
OHO
O
nordihydroguaiaretic acid NDGA 18
HO
OHO
O
enterodiol END 15
OH
OH
enterolactone ENL 14
O
O
HO
HO HO
HO
O
O
α-naphthoflavone ANF 13
O
O
OH
OHO
coumestrol 19
Figure 3.10 – The structures of the enterolignans ENL and ENDand their dietary precursors MAT 14 and SI, and the structures ofNDGA , ANF and coumestrol.
the aromatase enzyme in microsome and cell-based assays.[92, 93] The en-
terolignans enterolactone 15 (ENL) and enterodiol 16 (END) are produced
by the human gut microflora from their dietary precursors matairesinol 14
(MAT) and secoisolariciresinol 17 (SI), respectively (Figure 3.10) [94, 95, 96,
97], and these enterolignans and their precursors are responsible for the
aromatase inhibition in the human body. Although the binding modes of
these compounds have not yet been proposed, their inhibition has been
measured.[92, 93] The lignanolactones bear a sterically unhindered carbonyl
oxygen capable of coordinating to the heme iron. However, their relatively
large size and flexible structure compared to that of the flavonoids proba-
27
Chapter 3. CYP450 aromatase
bly distorts the binding geometry and results in weaker binding (see publi-
cation II). The lignanodiols, which lack the carbonyl group, probably bind
to the iron via a hydroxy group. Note that nordihydroguaiaretic acid 18
(NDGA, Figure 3.10), which lacks hydroxy groups at the same positions
where ENL and END have potentially coordinating oxygen atoms, is a rela-
tively strong aromatase inhibitor among the lignans.[92]
Coumestrol 19 is a tetracyclic polyphenol belonging to the class of coumes-
tan compounds (Figure 3.10). Coumestrol was first identified in alfalfa and
has later been found in plants, such as soya and cow pea, included in the
human diet.[98, 99] Coumestrol has been shown to inhibit aromatase activ-
ity with an IC50 value of 17 µM and a Ki value of 1.3 µM , being thus a weak
inhibitor.[93] Mutagenesis studies on aromatase made with coumestrol have
shown that mutants S478T and H480Q have reduced binding affinity for
coumestrol [75] (see Section 7.1 on page 61).
3.6 Commercial aromatase inhibitors
One approach to treating hormone-dependent breast cancer is to block the
biosynthesis of estrogens through the inhibition of aromatase or 17β-HSD1.
Inhibitors of the 17β-HSD1 enzyme have not yet reached the market, while
even though the 3D structure of aromatase has not yet been experimentally
solved, a few drug discovery projects have succesfully led to commercial
aromatase inhibitors (AIs) (Figure 3.11). Already, three AIs are on the mar-
ket to treat post-menopausal breast cancer by inhibiting the local estrogen
production in the tumour tissue.[100, 101] Exemestane 20 (Aromasin R© from
Pfizer) is a type I inhibitor, a steroid derivative functioning as a suicide in-
hibitor. It is recognised by the aromatase active site and the catalytic reaction
produces an intermediate that covalently binds to the enzyme, permanently
inactivating it. Letrozole 21 (Femara R© from Novartis) and anastrozole 22
(Arimidex R© from AstraZeneca) are type II inhibitors that are non-steroidal
28
3.6. Commercial aromatase inhibitors
and bind reversibly to the aromatase enzyme. The type II inhibitors have a
sterically unhindered triazolyl group, whose nitrogen coordinates through
its free electron pair to the heme iron. Both type I and type II AIs are being
used in post-menopausal breast cancer treatment.
O
O
NN
N
N N
NN
N
N
N
exemestane 20 letrozole 21 anastrozole 22
Figure 3.11 – The structures of the commercial aromatase in-hibitors.
29
Chapter 4
17β-Hydroxysteroid
dehydrogenase type 1
17β-Hydroxysteroid dehydrogenases are a group of enzymes catalysing the
interconversion between the C17 functionality in 17β-hydroxysteroids and
17-ketosteroids.[102] 17β-Hydroxysteroids are the biologically active forms
of these hormones both in androgens and in estrogens.[103] The reductive
catalytic reaction requires cofactor nicotinamide adenine dinucleotide phos-
phate NADP(H) (Figure 3.2 on page 11) or its dephosphorylated form nicoti-
namide adenine dinucleotide NAD(H). There are 14 subtypes in the 17β-
hydroxysteroid dehydrogenase family, of which 12 are found in humans,
distinguished by substrate and cofactor specificity.[104]
The 17β-hydroxysteroid dehydrogenase type 1 enzyme (17β-HSD1), en-
coded by the gene HSD17B1, plays a crucial role in the female hormonal
regulation by catalysing the NADPH-dependent reduction of the less po-
tent estrone (E1) into the biologically active estradiol (E2) (Figure 2.3 on
page 7). Androgens are also converted, but to a minor extent.[102] 17β-
HSD1 has been crystallised as an apoenzyme and a holoenzyme and with
estrogens, androgens and synthetic inhibitors. [105, 106, 107, 108, 109, 110,
111, 112, 113] Human 17β-HSD1 enzyme is active as a soluble cytosolic
31
Chapter 4. 17β-Hydroxysteroid dehydrogenase type 1
homodimer and is mainly expressed in the ovaries and placenta and in
liver and breast tissue.[102] 17β-HSD1 is also expressed in malignant breast
tissue,[114] where it has been suggested to have an important role in in situ
E2 production.[115, 116, 117, 118]
4.1 Structure and catalytic mechanism
of 17β-HSD1
17β-HSD1 is a member of the short-chain alcohol dehydrogenase super-
family (SDR) and has the conserved and catalytically crucial Tyr-x-x-Ser-
Lys sequence in the active site (catalytic triad Y155, S142 and K159).[103]
Investigations with bacterial hydroxysteroid dehydrogenases suggest that
the near-by asparagine N114 should also be included among the catalytic
residues.[119, 120] One enzyme monomer contains 328 amino acids with a
molecular weight of ∼35 kDa. The tertiary fold of the enzyme contains the
substrate binding domain and a Rossmann fold — a wall of β-sheets sur-
rounded by α-helices, which is associated with cofactor binding.[121] The
catalytic residues, including Y155, S142 and K159 (and N114), are located
near the nicotinamide moiety of the cofactor NADPH.[122] The active site is
a narrow tunnel consisting of lipophilic residues and is complementary in
shape and volume to the steroid skeleton.[123] The substrate E1 binds to the
cavity by forming hydrogen bonds to residues Y155 and S142 from the C17
carbonyl and to H221, at the opposite end of the cavity from the C3 hydroxy
group.[107] An additional hydrogen bond may be formed from E282 to the
C3 hydroxy group. The 17β-HSD1 enzyme contains a loop near the active
site (residues 186–199) for which several conformations have been observed
in crystal structures. This loop has been associated with substrate entry and
cofactor stabilisation.[124, 109] The overall structure of the enzyme and the
active site with selected residues are displayed in Figure 4.1.
A catalytic mechanism has been proposed for the conversion of E1 to E2
32
4.1. Structure and catalytic mechanism of 17β-HSD1
Figure 4.1 – The tertiary structure and active site of 17β-HSD1,PDB entry 1FDT. The cofactor NADP+ is coloured blue and E2 ma-genta. For clarity, only selected active site residues are displayedand the secondary structures are displayed in transparency.
(Figure 4.2).[106] K159 lowers the pKa value of the Y155-OH group, facili-
tating the proton transfer from Y155-OH to the C17 carbonyl oxygen. The
electrophilic attack of the Y155 proton on the C17 carbonyl oxygen and the
hydride transfer from NADPH to C17 results in the formation of E2. S142
has been proposed to stabilise the oxyanion in Y155 [106] or to provide a hy-
drogen bond counterpart for the C17 carbonyl [102]. It should be noted that
K159 is too far from Y155 to interact with it in the crystal structures. How-
ever, a small repositioning would accomplish a suitable orientation.[106] A
network of solvent molecules is proposed to provide the missing proton.
33
Chapter 4. 17β-Hydroxysteroid dehydrogenase type 1
HO
O
NH
H
HO
Y155
N K159H
S142 OH
HO
H
estrone E1 8 NADPH
CONH2
H
H
H2NN114
O
Figure 4.2 – The proposed catalytic mechanism for 17β-HSD1 con-version of E1 to E2. Redrawn and modified from reference [106].
4.2 Molecular modelling of 17β-HSD1
The 17β-HSD1 enzyme is an attractive target for controlling the production
of E2 in the human body and so attacking breast cancer. Yet surprisingly few
drug discovery projects have focused on 17β-HSD1, even though the crys-
tal structure of the enzyme was reported in 1993 [105] and the first structure
was deposited in the PDB in 1995 and released in 1996.[106] The steroidal
skeleton of E2 has been a target for modification with the aim of finding po-
tent and selective inhibitors of 17β-HSD1 . Modification of the substituents
at C16 and C2 led to potent inhibitors [125], which were docked into the
active site of 17β-HSD1. The steroidal moiety of the inhibitor was found to
superimposed with E2 in the crystal structure of the enzyme, and hydrogen
bonds were formed between a substituent at C16 and residues normally
interacting with the cofactor. The study was later extended to include a
substituent at C17, yielding the C5’ amide pyrazole derivative of E1 23 (Fig-
ure 4.3).[126] With use of the inhibitory data and FlexS [127] ligand align-
ment, a QSAR model was built and validated with an external test set. The
QSAR model was later refined with data from both previously published
and newly synthesised compounds.[128, 129] The compounds were again
aligned using FlexS[127], and a comparative molecular similarity indices
analysis (CoMSIA) was performed. The model was validated with an exter-
34
4.2. Molecular modelling of 17β-HSD1
nal test set, which suggested favourable and unfavourable regions in regard
to hydrogen bond acceptors and donors, and regions favouring positive and
negative charge.
OH
O
O
HO
O
OH OH
N
NN
N
NH2
25
HO
24
F
FF
HO
23
NHNHN
O
SN
NO
O
OBr
O
HO
27O
OHN
O BrBr
NNH
O NH2
N
28
HO
26
O
Figure 4.3 – Selected inhibitors from molecular modelling studiesof 17β-HSD1.
In another study, a highly electronegative fluorine substituent at C17 of the
E2 skeleton was tested for 17β-HSD1 inhibitory activity.[130] With a fluorine
at C17, E2 is thought to mimic the transition state of the enzymatic catalysis.
Other, mostly hydrophobic substituents at C2, C8, C9, C13, C14, C15, and
C18 were included in the study. The synthesised compounds were tested
for inhibitory activity against human HSD subtypes 1, 2, 4, 5 and 7. Only
limited selectivity towards 17β-HSD1 was observed. The most potent 17β-
HSD1 inhibitor 24 (Figure 4.3) with three fluorine substituents at C7 and C17
was docked into the active site of the 17β-HSD1 enzyme crystal structure
35
Chapter 4. 17β-Hydroxysteroid dehydrogenase type 1
(PDB entry 1FDT). A binding mode was suggested, where the 17β-fluorine
accepts hydrogen bonds from S142 and Y155. As well, a hydrophobic pocket
for 7α-fluorine was identified.
The first effort to design a hybrid inhibitor, that is, a compound interact-
ing with the substrate and the cofactor binding sites, was reported in 2002
[113], and the idea was published in 2001.[131] With the substrate crystal
structure from the crystallised complex [108] and the cofactor modelled into
the apoenzyme structure [106], a part of the cofactor was deleted and con-
nected to the 16β-position of E2 with methylene linkers of different length.
The compounds were gradually minimised in the active site and in the ab-
sence of the enzyme structure. The energies of the minimisations for each
compound and for each degree of freedom were compared, and the optimal
spacer length was found to be 8 or 9 methylenes. Subsequent synthesis and
inhibitory assay showed that a linker length of 8 methylenes provides the
powerful and competitive inhibitor 25 (Figure 4.3). A crystal structure of
the enzyme with the newly created hybrid inhibitor was also solved and a
binding mode, making use of the substrate and the cofactor binding sites,
was proven (PDB entry 1I5R).[113] An extension to the previous study [113]
was conducted and several hybrid inhibitors, having a steroidal and an
adenosine moiety, were synthesised and tested for inhibitory activity.[132]
Later, new inhibitors based on 25 were synthesised and tested for inhibitory
activity.[133] The substrate E2 and a newly developed inhibitor 26 having
a benzylidene group at C16 (Figure 4.3), were docked into the active site
of the 17β-HSD1 crystal structure (PDB entry 1IOL). The result was that
the steroidal moiety of the inhibitor was shifted relative to E2 in the crys-
tal structure owing to the spatial requirements of the benzylidene group.
Because the new inhibitor was less potent than 25, it was speculated that
further modification of the benzylidene group would lead to a more potent
inhibitor.
The active sites of several crystal structures of the 17β-HSD1 enzyme have
been analysed with the SPROUT program.[134] The narrowness and hy-
36
4.2. Molecular modelling of 17β-HSD1
drophobicity of the active site tunnel were confirmed, and several starting
points for de novo inhibitor design were suggested.
A recent study concerning variously substituted pyrimidinones included
an analysis of the crystal structures of 17β-HSD1 and ideas for modification
of the lead compound derived from the analysis.[135] New inhibitors were
synthesised and tested and selective 17β-HSD1 inhibitors were identified.
The most potent of these was the pyrimidinone derivative 27 (Figure 4.3),
for which the binding mode was also suggested.
Fungal 17β-HSD from Cochliobolus lunatus (17β-HSDcl) has been suggested
as a useful model enzyme for human 17β-HSD1.[136] A QSAR study based
on flavonoids and cinnamic acid esters as 17β-HSDcl inhibitors was per-
formed with the use of quantum chemically calculated descriptors. Wide
differences between the experimental and predicted inhibitory activities were
found.
Pharmacophore modelling in connection with virtual screening is a rela-
tively new approach being pursued in drug discovery projects relative to
17β-HSD1. Schuster et al. [137] developed pharmacophore models based on
two crystal structure complexes: one with the steroidal ligand equilin and
one with the hybrid inhibitor 25. The authors used the automated pharma-
cophore generation program LigandScout [138, 139]. As a result of the phar-
macophore generation, database screening, synthesis and biological testing,
four inhibitors with IC50 values below 50 µM were identified, the most po-
tent being the hydrazinecarboxamide derivative 28 shown in Figure 4.3.
37
Chapter 5
Methods used in molecular
modelling
This chapter introduces the molecular modelling methods used in my work.
Only molecular mechanistic methods were applied, since only they are suit-
able for large molecular systems to be treated as a whole. Quantum chemical
methods can only be applied to much smaller systems than whole proteins,
although a combination of quantum chemical and molecular mechanistic
methods has successfully been used in modelling enzymatic reactions.[140,
141, 142, 143]
5.1 Force fields
The atom nucleus and the surrounding electron cloud are treated separately
in quantum chemistry, making it possible to model chemical reactions, but
only with massive computational resources. Quantum chemistry can, there-
fore, only be applied to relatively small molecules, and definitely not to large
biomolecules in solvent environment. The study of large molecular systems
in silico requires that simplifications be made, e.g., treating atoms as rubber
39
Chapter 5. Methods used in molecular modelling
balls joined together with strings. The properties of these balls and their
interactions are dependent on the selected force field. A force field is a set
of parameters and mathematical equations used to describe the properties
of atoms and their bonded interactions (bonds, angles, dihedral angles) and
atoms and their non-bonded interactions (van der Waals and electrostatic
interactions). The bonded parameters include the definitions of the atomic
masses and charges for different atom types, bond lengths, bond angles,
and dihedral angles. Several types of force field have been developed for
different modelling targets, e.g., for small molecules, biopolymers and lipid
membranes. Together the definitions of the parameters and the equations
define the behaviour and potential energy of the system under study.
In molecular mechanics, a bond between two atoms is described as a har-
monic oscillator between two particles. The potential energy reaches a min-
imum at a certain distance depending on the atoms in question, and it in-
creases exponentially when the distance is either decreased or increased
(Figure 5.1). The harminic potential is defined so that the bond is not al-
lowed to break. A special case is the Morse potential [144], where the poten-
tial energy curve is not quadratic, and at large distances between the atoms
the bond is allowed to break (Figure 5.2). This feature can be useful in sim-
ulating metal coordination bonds, as was done in publications II and III.
Angles are also described as harmonic functions. Torsion angles, however,
owing to their periodic nature, are described as cosine functions (Figure 5.3).
Bonded interactions can be described in terms of the stretching of bonds,
bending of angles and the rotation of torsion angles. The equation usually
used in calculating the potential energy arising from bonded interactions is
described [145] as
Ebonded =∑
bonds
Kb(b− b0)2 +
∑
angles
Kθ(θ − θ0)2 +
∑
torsion
Kχ[1 + cos(nχ− σ)]
where b is the distance between atoms, b0 is the optimum bond length and
40
5.1. Force fields
Figure 5.1 – The quadratic form of the potential used in bondstretching and angle vibration.
Kb is the force constant of the bond; θ is the angle of three atoms, θ0 is the
optimum angle and Kθ is the force constant of the angle; and n is the mul-
tiplicity of the torsion angle (indicating the number of energy minima at a
full turn of 360◦), χ is the value of the torsion angle, σ is the phase angle
(an angle where the potential is at its minimum value) and Kχ is the pa-
rameter defining energy barriers during the turn. Usually in biomolecular
force fields a term for improper dihedral angles is added and described as
a quadratic function. The improper dihedral term ensures that planar rings
stay planar and that the stereochemistry of a molecule is not flipped to its
mirror image during the dynamics run.
Non-bonded interactions between two atoms i and j are calculated with the
equation
Enon−bonded =∑
pairsij
(εij[(Rmin,ij
rij
)12 − 2(Rmin,ij
rij
)6] +qiqj
rij
)
The first term inside the square brackets is the van der Waals term (or Lennard-
Jones term) and represents the attraction and repulsion of two atoms mov-
ing closer to each other. Rmin,ij is the distance where the potential energy
between the two atoms is at minimum, and rij is the interatomic distance.
41
Chapter 5. Methods used in molecular modelling
Figure 5.2 – The Morse potential.
Figure 5.3 – The cosine form of the torsion angle potential.
When the two atoms are far away from each other, the negative term dom-
inates and the interaction is attractive. As the atoms get closer, the positive
term begins to dominate and the interaction becomes repulsive. ε is a pa-
rameter that depends on the atoms in question and defines the deepness of
the potential energy minimum.
The second term describes the electrostatic interactions and is called the
Coulombic term. qi and qj describe the partial charges of the atoms i and
j. Thus, the total energy of the system can be calculated from the equation
42
5.2. Homology modelling
Etot = Ebonded + Enonbonded + Eother
where Eother includes terms that are specific for a certain force field.
Restraints can be added in the course of the simulations to secure the geom-
etry of a certain part of the system. These include position, angle, dihedral,
distance and orientation restraints. The use of distance restraints adds a
penalty to the potential energy function if a certain distance between atoms
is exceeded.[146]
5.2 Homology modelling
Homology modelling is a powerful tool for predicting the 3D structure of a
protein. In contrast to threading, (usually) only one template is used. In ho-
mology modelling, a known 3D structure of a macromolecule (the template)
is used to derive the structure of an unknown macromolecule (the target).
The steps in the modelling process include the identification of proteins with
known 3D structure that are related to each other with known 3D structure
and selection of one of these for the template, identification of structurally
conserved regions (SCR) and structurally variable regions (SVR), sequence
alignment, assignment of coordinates to SCRs and to SVRs, optional side-
chain modelling, relaxation and validation of the final structure.
Templates that are evolutionally related to the target are found with the help
of software or by searching the literature, preferably both. The identifica-
tion of SCRs and SVRs should be based on literature studies, with the as-
sistance of software for secondary structure predictions.[51, 147] The meth-
ods predicting secondary structures include statistical methods and neural
networks.[148] The accuracy of these methods has increased significantly
43
Chapter 5. Methods used in molecular modelling
in the recent years thanks to the increase in the number of solved protein
structures and the development of algorithms.
Sequence alignment is a technique, where the primary amino acid sequence
of the template is aligned with a sequence of the target in order to find struc-
turally or functionally conserved elements. The goal is to match the cor-
responding residues between the sequences, and thus obtain information,
among other things, on the constitution of the target active site structure. If
the sequence similarity is high, i.e., only a few residues differ between the
sequences, the alignment is easy to perform. Conserved mutations between
the sequences are usually not a problem because the physico-chemical prop-
erties and the size of the mutated residue resemble the properties and size
of the original (e.g., Ile to Leu). Problems may arise, however, when a small
residue is mutated to a large one (e.g., Gly to Trp) or the properties of the
new residue are opposite to those of the original (e.g., Asp to Arg). In these
situations the alignment should be carefully examined, and experimental
data should be applied to validate the alignment.
Automated sequence alignments are usually made with dynamic program-
ming, where each residue in each sequence are put on x and y axes of a
table.[149, 150] Each residue is then compared with all other residues in the
other sequence and a similarity score is entered in the table. Finally, all
similarity scores are compared and the highest scores are used to find the
optimal alignment.
Sequence alignment is the single most important part of the homology mod-
elling process. With proteins sharing low sequence identity, this is by no
means a trivial task. Gaps and insertions must be introduced to the target
sequence to make the important residues match each other in the alignment.
While automated methods are available for this task, results from the soft-
ware should always be verified manually and they must correspond to the
experimental data from mutagenesis and similar studies. Popular multiple
sequence search and aligment programmes are FUGUE [151], BLAST [152],
44
5.2. Homology modelling
ClustalW [153] and T-Coffee [154].
After the conserved parts of the structures have been identified and aligned,
the atomic coordinates from the template are assigned to the model. First,
the SCRs (with or without the side chains) are generated by assigning co-
ordinates from the template. SVRs (loops and random coil areas) can be
built de novo or searched from the PDB [155], which provides connective
structures found in protein crystal structures. To end up with a reasonable
protein structure, the modeller needs to take special care when assigning co-
ordinates to sequence areas with gaps and insertions and find suitable con-
necting loops and coils. Identical side chains between the template and the
model are assigned the original conformation from the template. In the case
of similar side chains, the original conformation is preserved where appli-
cable, as in the coordinate assignment protocol in the Homology module of
Insight II.[156] Problems arise when, for example, a glycine is to be replaced
with an arginine. Because an arginine side chain is much larger that that of
the glycine proton, a non-clashing conformation has to be found. Several
conformations for arginine are browsed and the one that best fits the sur-
roundings should be selected. When every residue in the sequence has been
assigned coordinates, the side-chain conformations can be optimised with
the SQWRL [157] or a comparable program.
After sequence-structure alignment, the model structure must be relaxed
due to the clashes and distortions originating from the different spatial re-
quirements of the aligned residues and newly generated coil and loop areas.
Energy minimisation of the model structure is performed, where atom posi-
tions, bond lengths, bond angles and torsion angles are optimised. If the raw
structure of the model includes clashes or distorted bond/dihedral angles,
the structure can be minimised step-by-step. First, the heavy atoms are kept
at their initial positions and only the hydrogens are allowed to move. Sec-
ond, backbone atoms are fixed to their positions and the side chains are free
to move and, finally, all atoms in the model are free to move. Another pos-
sibility is to restrict the movement of heavy atoms by tethering constants,
45
Chapter 5. Methods used in molecular modelling
which are gradually decreased during the minimisation process.[158] The
result is a relaxed protein model, which can (in some cases) be used in fur-
ther studies. However, the minimised model should first be validated, with
molecular dynamics simulations, for example.
5.3 Molecular dynamics simulations
A powerful method for the validation of a homology model, and for study-
ing the motion and conformational changes of macromolecules, is molecu-
lar dynamics simulation (MDS). We used the GROMACS software package
[146], which is widely used, versatile, rapid and available free of charge to
academic users. The procedure for simulating a generated homology model
is as follows: the energy minimised protein model is placed in a simulation
box, the box is filled with water (and ions) and the system is energy min-
imised to obtain a reasonable starting structure for the system for MDS. The
system is assigned a certain temperature and the initial velocities are calcu-
lated for each atom and updated after pre-defined intervals for a pre-defined
number of times. First the system is equilibrated with restrained backbone
atoms to allow the side chains and solvent molecules to adapt to their en-
vironment. Alternatively, all heavy atoms of the protein can be restrained.
After the equilibration, the system is allowed to move freely, and the be-
haviour of the system is monitored in terms of potential energy conserva-
tion and, for example, backbone movement. The result is a time-dependent
trajectory, that is, an ensemble of protein conformations including confor-
mations representing local energy minima.
The time scales used in MDS are on the order of femtoseconds (10−15 s) for
each step. By repeating these steps millions of times, the time span cov-
ered in typical MDS for a reasonable-sized protein is on the order of tens
or hundreds of nanoseconds. With parallel supercomputing, microseconds
and even milliseconds can be achieved. In reality, normal protein folding
46
5.3. Molecular dynamics simulations
requires milliseconds, and even rapid folding takes microseconds.[159] It is
understandable, therefore, that the conformational space of a protein cov-
ered in MDS is limited and the resulting structures are local rather than
global minima.
5.3.1 Periodic boundary conditions
When a real protein is solvated, there are millions of protein molecules sur-
rounded by an even larger number of water molecules. In simulation sys-
tems, however, usually one protein molecule surrounded by a thin sphere
of water molecules is simulated to keep computational costs under control.
This simplification leads to unwanted effects on the surface of the simula-
tion box, very near to the protein. These surface effects can be avoided by
introducing periodic boundary conditions, in which the simulation box is
surrounded by images of itself in all dimensions. If the protein moves out
of the simulation box during the dynamics run, it moves back inside the box
from the opposite side. Thus, the system contains no boundaries, and sur-
face effects do not disturb the actual simulation. Different box shapes are
used to simulate reality as well as possible. A sphere would be the shape
of choice, but there is no periodicity in it. Earlier, a cubic box was the most
common box type, but nowadays a rhombic dodecahedron or a truncated
octahedron is preferred. The latter two are smaller than a cubic box with the
same image distance and thus save computational time.
5.3.2 Long-range interactions
Non-bonded atoms of a real system interact with each other over a large
distance, but these pair interactions cannot all be taken into account in sim-
ulations because the computational cost increases as the number of particles
squared. Usually a cutoff distance of 10 A is imposed. Within the cutoff,
47
Chapter 5. Methods used in molecular modelling
neighbour lists including the closest neghbours for all atoms are stored (and
updated after a few steps) and the interactions between the neighbours are
calculated. However, truncation of the interactions beyond the cutoff does
not provide a realistic situation, and smoothing functions, such as Particle-
Mesh Ewald method (PME) [160], are used to create a more realistic simula-
tion. In PME, the calculation of the coulombic interaction is divided into a
short-range term within the cutoff and a long-range term beyond the cutoff.
The calculation of non-bonded interactions is computationally a highly de-
manding task owing to the large number of particles, and even larger num-
ber of interactions in the system. The computational cost can be decreased
by introducing bond constraints like LINCS [161], where certain (or all)
bonds are constrained to their equilibrium value. Constraining the bonds
means that the bond vibrations are neglected, and the time step can be in-
creased so that the computational cost is reduced.
5.3.3 Temperature and pressure monitoring
The above-mentioned cutoff schemes and rounding errors in the calcula-
tions may lead to elevated temperature and pressure during the simulation.
This problem is typically controlled by coupling the system to an external
thermal bath, which scales the velocities and maintains the simulation tem-
perature. The pressure of the system is controlled by scaling the atomic
coordinates and the size of the simulation box.[162, 163]
5.4 MDS trajectory analysis and protein
model validation
MDS produces a time-dependent ensemble of protein conformations, usu-
ally converged to a local energy minimum. The dynamic properties of the
48
5.4. MDS trajectory analysis and protein model validation
trajectory can be analysed to validate the simulation and/or the generated
protein model. The root mean square deviation (rmsd) is a measure of how
close a certain structure in the trajectory is to the initial (e.g. crystal) struc-
ture. In proteins, the backbone or Cα rmsd is usually calculated covering
all frames of the trajectory. The potential energy of the system is also moni-
tored. In a well-behaving simulation, these two curves should stabilise dur-
ing the simulation. Another analysis tool is the root mean square fluctuation
(rmsf). If calculated per residue, it shows how much a certain residue has
moved during the simulation. Residue movements are very useful in iden-
tifying stable (helices and sheets) as well as flexible (loops, coils) parts of the
protein.
In further studies (e.g., docking), a static structure is required. The question
is which one to choose from the trajectory? One way to obtain a representa-
tive structure is by clustering. The stable part of the trajectory is clustered on
the basis of backbone rmsd, for example, and the representative structure of
the biggest cluster is then extracted from the water environment, minimised
and used for further studies. Another approach is to calculate an average
structure on the basis of the stable part of the simulation. However, the av-
erage structure does not represent any ”real” conformation of the trajectory.
Static properties of the representative structure can be analysed by a pro-
gram such as PROCHECK.[164] PROCHECK produces several analyses of
the quality of the structure, including main chain bond angles and lengths,
chirality, hydrogen bonds and disulfide bridges. Perhaps the most useful
feature is an analysis of the backbone dihedral angles φ, ψ and ω (Fig-
ure 5.4), displayed as a Ramachandran plot (Figure 5.5). This plot shows
a graphical representation of the backbone angle combinations frequently
observed in secondary structure elements in experimental structures, and it
reveals backbone conformations that are illegal. Illegal angles are also fre-
quently found in crystal structures, which is one reason why they should be
relaxed before use in further studies. The ω angles should all be in trans-
configuration, although cis-peptides are not unknown among experimental
49
Chapter 5. Methods used in molecular modelling
protein structures.
HN
Cα peptide chain
O
O
R
phi φ psi ψ
omega ωpeptide chain
Figure 5.4 – The peptide bond and the main chain angles phi φ,psi ψ and omega ω.
50
5.4. MDS trajectory analysis and protein model validation
Figure 5.5 – A Ramachandran plot of CYP2C5 (PDB entry 1NR6).The red area in the upper left corner shows the backbone confor-mations that are found in β-sheets; the conformations found in α-helices are shown in the red area in the middle-left and left-handedα-helices in the small red area in the middle-right. Residuesmarked with red squares have a bad conformation, which usuallydisappears during minimisation and/or dynamics simulation.
51
Chapter 5. Methods used in molecular modelling
5.5 Ligand–protein docking
Biological macromolecules interact with other molecules, either other macro-
molecules or small molecules. Small molecules interact with proteins in
several ways. To mention a few, substrates are modified by a protein (an
enzyme), inhibitors block the function of an enzyme (either covalently or
non-covalently), and agonists and antagonists influence the activity of a
biomolecular receptor. Common to all these functions is that the small mol-
ecule (the ligand) must be recognised by the macromolecule and bind to it,
that is, there must exist favourable interactions that overrule the energetic
states of the separate binding counterparts. Ligand–protein docking predic-
tions form one important area of molecular recognition studies. Docking
can be used in several ways: for example, to study the mechanism of an
enzymatic reaction, to identify possible binding modes for a ligand, and to
screen a database.[165]
Over one hundred years ago, Emil Fischer proposed his lock-and-key mech-
anism for ligand binding. In this model, the active site and the ligand are
considered as rigid objects matching each other in shape and size. Later,
in 1958, Daniel E. Koshland, Jr. proposed a process of dynamic recognition
between a ligand and an active site. In this model, the ligand is matched to
the receptor after the binding, that is, the shape and size of the receptor are
modified during the binding, while a complex is formed between the ligand
and the active site. Modern docking software treats the ligand as flexible
(or a library of rigid conformations is used), whereas the protein active site
is treated as a rigid or a semi-rigid cavity. The flexibility of the active site
is difficult to take into account in docking software: a proper sampling of
active site conformations would require the use of molecular dynamics sim-
ulations, simulated annealing or Monte Carlo methods. Because all these
methods are time consuming, their applicability is limited. One attempt
to take the flexibility of the active site into account involves letting the po-
lar hydrogens move during the docking, as in GOLD [166], or rotating the
52
5.5. Ligand–protein docking
side chains of user-specified residues, as in FlexScreen [167] and the Flex-
ible Docking protocol in Discovery Studio 2.1 [168]. However, taking the
backbone movements or even conformational changes of the protein into
account is beyond reach for fast molecular docking or virtual screening. I
tested several docking software in my work, and usually GOLD produced
the best results. However, as described in publication III, Surflex-Dock pro-
duced a better alignment of the 17β-HSD1 inhibitors.
Search methods and scoring functions
Finding the bioactive conformation of a ligand requires that the conforma-
tional space be searched, preferably fast and accurately. Two main search
methods are used in present docking software: systematic and stochastic.
In a systematic search the rotating bonds of the ligand are turned in small
increments and the lowest energy conformations are saved. These confor-
mations are usually used to create a library of rigid ligand conformations,
as in FRED. [169] In a stochastic search, the ligand is treated as a whole or
as a group of small ligand fragments.
In the first case, new conformations of the whole ligand are generated and
compared with the previous one. If a better (lower in energy) solution is
found, this is stored and the search is continued until some termination cri-
terion defined by the modeller is met. These searches include simulated
annealing and Monte Carlo methods. One approach in whole ligand treat-
ment is the genetic algorithm (GA) used in GOLD [166]. First, random con-
formations of the ligand are generated. Favourable features of the random
conformations are then combined and new offspring are produced, and the
best solution is found after the evolution of the ligand conformations. In
the fragment-based methods, the molecule is cut into fragments, the frag-
ment(s) are docked, and the ligand is ”grown” back by selecting suitable
conformations of the remaining molecular fragments. These methods re-
semble the de novo ligand design approach, where novel inhibitors are gen-
53
Chapter 5. Methods used in molecular modelling
erated from scratch. Examples of the fragment-based methods are the Sur-
flex method [170], where the end fragments of a ligand are fitted to a nega-
tive image of the active site, the so-called protomol, and the FlexX method
[171], where the fragments of the ligand are added to a rigid base fragment.
The search methods can also be divided into local and global methods. In lo-
cal methods a local minimum of the starting structure is searched, whereas
in global methods a larger sampling in the conformational space is per-
formed, which usually provides better solutions.
The scoring function plays an important role during a docking run. It is
a mathematical way of directing the docking routine and ranking the ob-
tained solutions. The scoring functions can be divided into empirical, force
field-based and knowledge-based functions.[172] Empirical scoring func-
tions, such as Chemscore [173] and Surflex-Dock [174], take into account
experimental data on binding affinities, force field-based scoring functions
calculate the van der Waals and electrostatic interactions between the lig-
and and active site atoms, and knowledge-based scoring functions, such as,
PMF score [175], apply statistical data from solved complex crystal struc-
tures. Additionally, solvation terms may be taken into account. There are
no simple preferences of one method over another, but the available scor-
ing functions are usually tested and the best one (that is, matching with the
experimental data) is chosen for the given data set. A common procedure
is the rescoring of the solutions. After a docking run, some other scoring
function is applied to rank the found solutions. A consensus in the scoring
functions produces the most probable solution.
5.6 QSAR methods
Quantitative Structure–Activity Relationship (QSAR) is a method where the
differences in structure of compounds are correlated with differences in
their biological activity. The most common use of QSAR is to predict the
54
5.6. QSAR methods
activity of a ligand based on a mathematical model derived from known
ligands. Usually tens of compounds with known activity are needed to cre-
ate a predictive model. A QSAR study begins with identifying compounds
that are to be used in the model building (the training set). These should be
structurally variable, share the same binding mode and cover a wide range
of activities, from inactive to highly active compounds. After the collection
of the training set, the compounds must be properly aligned. The align-
ment can be done with or without the target protein. In the absence of the
target protein, the compounds are usually aligned to the most active com-
pound, which should be in its binding conformation. If the target protein
is available, docking the training set is the method of choice. Irrespective
of the method, the alignment should be carefully studied and the important
structural features of the ligands (hydrogen bond acceptors and donors, hy-
drophobic cores, aromatic rings etc.) should appear as superimposed. The
partial charges for the atoms in the molecules need to be assigned in or-
der to take into account the electrostatic interactions. Each ligand is rep-
resented via descriptors, which is a fingerprint of, for example, the sterical
and electrostatic and other interactions. The interaction energies are cal-
culated with, for example, the GRID program.[176] Each compound in the
training set is put into a grid, and the interaction energies between a probe
and the compound are calculated. Probes are atoms or functional groups
that represent some feature of the active site. The most straightforward
QSAR method, Comparative Molecular Field Analysis (CoMFA) [177], uses
an sp3-hybridized carbon with a charge of +1 to encompass sterical and elec-
trostatic features of the compounds in the training set. Hundreds of other
probes are available for more sophisticated descriptor generation.[178] For
example, in the Comparative Molecular Similarity Indices Analysis (CoM-
SIA) method [179], the interactions that are used, in addition to the steric
and electrostatic fields, are hydrogen bonds and hydrophobic interactions.
After the descriptors have been calculated, a partial least-squares analysis
(PLS) [180] is performed to link the structural differences in the training set
55
Chapter 5. Methods used in molecular modelling
compounds to their biological activities. Usually, internal cross-validation
methods are applied before the final model is constructed. The leave-one-
out method generates a model that leaves one compound at a time out from
the training set, and predicts the activity of the excluded compound on the
basis of the generated model. This procedure is done for all compounds in
the training set, and the correlation between the predicted and actual data is
calculated. The correlation coefficient for the leave-one-out cross-validation
q2LOO gives an estimate of how predictive the model is. A value of over 0.6
should be obtained. Possible outliers (compounds with inaccurate or wrong
biological data) are usually identified at this point, and excluded from the
training set. Other cross-validation methods are leave-five-out and leave
20% out. The cross-validation also produces the optimum number of com-
ponents that should be used. The number of components is the number of
dimensions in an N dimensional space that includes maximum explanatory
data without including any irrelevant data. After a satisfactory q2 value is
achieved, a non-validated PLS run is performed to produce the final QSAR
model. Before predicting the activity of a compound whose biological data
is unknown, an external validation should be performed. This validation
should include a set of compounds that were not used in the model build-
ing but for which the biological data is known. The final model can be
considered useful if the activities of the compounds in the test set are pre-
dicted with good accuracy. In predicting the activity of a compound with no
known biological data, the candidate compound must be properly aligned
and charges assigned.
One of the strengths of a QSAR model is the visualisation of the interaction
fields from the final model. The fields (steric and electrostatic in CoMFA)
are usually contoured to a certain energy level, and displayed together with
a compound or the target protein. Visualisation of the fields offers valuable
information about the favourable and unfavourable areas surrounding the
compounds and may suggest new ways to modify the compounds. If a
protein structure is available, superimposition of the model with the protein
56
5.7. Pharmacophore modelling and virtual screening
active site not only further validates the model but also helps in identifying
those residues that are most critical for the binding. A QSAR model can
provide valuable guidelines as to which theoretical compounds are worth
synthesising or which functional groups in existing compounds should be
modified.
5.7 Pharmacophore modelling and virtual
screening
A pharmacophore is a spatial arrangement of physico-chemical features cor-
responding to a drug’s biological activity. Typical features are hydrophobic,
aromatic, hydrogen bond donor and acceptor, cation or anion and, positive
or negative ionizable. These features are represented as spheres of user-
defined radius in space. Hydrogen bond features also have a direction.
Exclusion spheres can be added to represent areas where, for example, an
active site residue side chain is located. Pharmacophores can be generated
automatically or manually, for the target protein alone (without a bound lig-
and), for one or a group of compounds (without a protein) or for a protein–
ligand complex. When only the target protein structure is known, the al-
gorithm analyses the active site and shows pharmacophoric features origi-
nating from the active site residues, as in Discovery Studio [168]. If the tar-
get protein structure is not known, the pharmacophoric features can be de-
rived from the ligands. In the best case the structure of a complex is known
and the actual interactions between the ligand and the active site can be
analysed, as in the LigandScout program.[138, 139] This program also pro-
vides accurate spatial restrictions of the active site, when exclusion spheres
are included in the pharmacophore. LigandScout was used in publication
IV because it is capable of generating pharmacophores from ligand–protein
complexes. In the absence of a complex, the Catalyst pharmacophore tools
implemented in Discovery Studio were employed.
57
Chapter 5. Methods used in molecular modelling
The main strength of a pharmacophore is that it can be converted to a search
query and used in virtual screening. A database of 3D structures of com-
pounds is required, and the pharmacophoric features are matched with the
features of the compounds. The user can usually define how many of the
features are not taken into account, and a mapping score for each compound
is calculated ranking the resulting compounds.[168] This virtual screening
method is fast and allows the identification of possibly active compounds
with a completely different scaffold than the existing compounds, and it
is therefore a valuable tool in finding new drug candidates. Validation
of the pharmacophore used in a screening experiment usually requires a
test screening with a database of compounds to which some known actives
and/or inactives are added. A successful validation run finds the actives
and rejects the inactives. As in all modelling work, the results should be
inspected and validated visually before one proceeds further in the drug
discovery process.
58
Chapter 6
Aims of the study
Steroid hormone chemistry is one of the main research topics of the Phyto-
Syn group at the University of Helsinki. A few years ago, molecular mod-
elling was incorporated as part of the multidisciplinary projects with the
aim of providing a broader view of the field. A sophisticated three-di-
mensional homology model of the aromatase enzyme was built to assist
understanding of the final steps in estrogen production. Additionally, an
atomic level explanation was needed for the different inhibitory activities
of phytoestrogens against aromatase. Docking experiments and molecular
dynamics simulations were used to study the binding of a series of 17β-
HSD1 inhibitors, developed in our laboratory, to the 17β-HSD1 enzyme .
The results of inhibitory activity measurements and dockings of these in-
hibitors provided the necessary data to build a 3D QSAR model for further
inhibitor development. To deepen our knowledge of the interactions occur-
ring during inhibitor binding and to find novel scaffolds for the design of
new 17β-HSD1inhibitors, pharmacophore modelling and virtual screening
were included in the study. Specifically, the aims of my work were
? to build a homology model of the aromatase enzyme and study the
interactions between it and phytoestrogens (I and II)
Chapter 6. Aims of the study
? to study the binding of synthesised 17β-HSD1 inhibitors by molecular
dynamics simulations and to derive a QSAR model on the basis of
biological data for these inhibitors (III)
? to find novel scaffolds for inhibitor design using pharmacophore mod-
elling and virtual screening (IV)
60
Chapter 7
Results and discussion
7.1 The aromatase model and the binding
of phytoestrogens (I and II)
After a few quiet years, the homology modelling of aromatase was resumed
in 2000 when the crystal structure of the first crystallised mammalian CYP450
enzyme CYP2C5 was published.[13] Two further crystal structures of the
same enzyme but with co-crystallised ligands were published in 2003.[54,
181] These structures provided the starting point for our goal to build a so-
phisticated homology model of the aromatase enzyme.
The process was started with the submission of the aromatase sequence
downloaded from the Swiss-Prot database [182] to the FUGUE server [151].
The initial alignment of aromatase with CYP2C5 (PDB entry 1N6B)[54] ob-
tained from FUGUE was manually refined on the basis of the PsiPred [183]
secondary structure element predictions and several mutagenesis experi-
ments [34, 35, 70, 69, 72, 71, 62, 52, 77]. Various alignments were generated
by focusing on the substrate recognition sites defined for the CYP2 enzyme
family by Gotoh [27]. The final alignment is shown in Figure 7.1.
61
Chapter 7. Results and discussion
Figure 7.1 – Alignment of the aromatase sequence to the CYP2C5sequence. The residues corresponding to α-helices are colouredred and β-strands green. Boxes denote areas where the coordinateswere assigned from the template.
The coordinates for the SCR regions, the heme, and loops of identical length
between the template and the model were assigned from the template, and
the remaining loops were searched from the PDB [155]. Splice points, that is,
connective peptide bonds between the SCRs and the loops, were optimised
to trans-configuration and to optimal bond length before the raw model
minimisation. Side-chain optimisation with an external software was not
performed because the Homology module of Insight II software [156] as-
signs the conformation of the original side chain to the new one where ap-
plicable. In cases where bad contacts were observed, new conformations
were fetched from a rotamer library of the Homology module. The raw
model was minimised, and equilibrated and simulated using MDS with
GROMACS. The stable part of the trajectory was clustered on the basis of the
backbone root mean square deviation (rmsd), and the representative struc-
ture of the biggest cluster was extracted and minimised. Subsequently, MDS
62
7.1. The aromatase model and the binding of phytoestrogens (I and II)
study was made with the natural substrate androstenedione (ASD) to find a
plausible orientation of the substrate.
The MDSs performed on an empty-cavity model and for the complex re-
sulted in stabilisation of the backbone movements and the potential ener-
gies. The representative structure of the complex trajectory was subjected to
PROCHECK [164] to analyse the quality of the model. There were only two
residues in the disallowed region of the Ramachandran plot (Figure 7.2),
and as both are located in the β-sheet bundle they do not contribute to the
active site. The substrate ASD was properly oriented above the heme for the
oxidation of C19 of ASD (Figure 7.3). In summary, the model showed sta-
ble behaviour both with the substrate and with an empty active site cavity.
The majority of the mutagenesis results were explained on the basis of our
model.
Figure 7.2 – The Ramachadran plot of the generated aromatase ho-mology model. For explanations of the markings, see Figure 5.5.
63
Chapter 7. Results and discussion
Figure 7.3 – The overall structure of the generated aromatasemodel. α-Helices are coloured red, β-sheets yellow and the coilsand loops green. The heme and ASD are represented as blue sticksand the heme iron is represented as a magenta sphere.
As described earlier, phytoestrogens are moderate to weak inhibitors of the
aromatase enzyme. We wished to find atomic level explanations for the dif-
ferent binding affinities observed for lignans, flavonoids, isoflavonoids and
coumestrol but with emphasis on the lignans because their binding mode
to aromatase had not been reported earlier. Experimental inhibition data
was available for some lignans, including natural compounds and their the-
oretical and verified metabolites. In the case of flavonoids, isoflavonoids
and coumestrol, experimental data was available but an atomic level ex-
planation for the binding affinity, and especially for the Fe–O coordination
geometry, was desired. Molecular dynamics simulations (MDS) were per-
formed on the complexes formed by aromatase and one of the two natu-
ral inhibitors enterolactone 15 (ENL) and α-naphthoflavone 13 (ANF) (Fig-
ure 3.10 on page 27). ENL and ANF were placed in the active site cavity and
oriented so that the coordination between the oxygen and the iron could
64
7.1. The aromatase model and the binding of phytoestrogens (I and II)
take place. A Morse potential was used to define the interaction between
the iron and the coordinating oxygen. MDSs with GROMACS were per-
formed for both complexes. In the case of ENL the system stabilised after
12 ns (total simulation time was 15 ns), and with ANF the stabilisation took
17 ns (total simulation time 20 ns). The stable parts of the trajectories were
clustered and the representative structures of the largest clusters were ex-
tracted and minimised in vacuo. All-atom models of the protein were gen-
erated for the docking studies, which were done with the GOLD program
[166] because it was able to reproduce the binding conformations of ligands
in four complexes (PDB entries 1SUO, 1Z11, 1W0G and 1J0C). The modified
metal binding parameters [184] of GOLD were not included. As a result, the
coordination orientation and residues providing hydrogen bonding coun-
terparts and lipophilic areas affecting the binding of the compounds under
study were identified. Since the coordination is the single most important
contribution to the binding strength, the geometry of the coordination was
analysed with lignans and flavonoids. The dockings suggested that the tilt-
ing angle between the lignanolactone carbonyl plane and the heme plane
was 40◦, representing a 50◦ deviation from the optimal 90◦. The distance
from the heme iron to the oxygen was 0.24 nm, or 0.05 nm more than the
average 0.19 nm (based on the database search described in publication II).
The Fe–O=C angle was 129◦, showing a 9◦ deviation from the optimal 120◦.
These deviations from the optimal values, especially the tilting angle and
the distance, probably lead to weaker coordination and thereby, to weaker
binding.
When ring A of the lignan skeleton (Figure 2.1 on page 5) contains a C3
hydroxy group, a hydrogen bond is formed to the backbone carbonyl of
T310. When the hydroxy group is at C4, the hydrogen bonding counterpart
is the backbone carbonyl of P481. The hydrogen bonds from ring B have
their counterparts in E302 or I305. A hydrophobic area at SRS-5 (Figure 3.3
on page 12) accommodates the hydrophobic skeleton, especially in the case
of NDGA. In the case of lignanodiols, that is, lignans lacking the lactone
65
Chapter 7. Results and discussion
Figure 7.4 – Above left: The binding mode of ENL (green). Aboveright: The binding mode of ANF (orange). Below left: The coordi-nation geometry between C4 carbonyl and the heme iron for fla-vone (cyan) and isoflavone (blue). Below right: The binding modeof coumestrol (yellow). Hydrogen bonds are displayed as dashedred lines
structure, the coordination takes place from either the phenolic or the ali-
phatic hydroxy groups. Hydrogen binding sites are the backbone carbonyls
of P481 and I305. Interestingly, enterodiol 16(END) (Figure 3.10 on page 27)
does not form any hydrogen bonds to the active site residues but forms an
intramolecular hydrogen bond between the phenolic hydroxy groups. The
binding orientations of ENL and ANF are displayed in Figure 7.4.
The binding mode of the flavone/isoflavone compounds, as proposed ear-
lier [185, 71], suggests that the C4 carbonyl coordinates with the heme iron.
These earlier studies relied on homology models of aromatase with an ac-
66
7.1. The aromatase model and the binding of phytoestrogens (I and II)
tive site cavity not refined for the flavonoid skeleton. In this work, MDSs
were performed for an ANF–aromatase complex to find the optimal active
site conformation for the flavonoid/isoflavonoid binding, and these com-
pounds were then docked to the representative structure from the MDS
study. The dockings showed the distance from the C4 carbonyl oxygen to
the heme iron to be 2.0 A for flavone and 2.2 A for isoflavone. The angle Fe–
O=C was found to be 136◦ for flavone and 160◦ for isoflavone (the optimal
angle is 120◦), while the corresponding tilting angle between the carbonyl
plane and the heme plane were 73◦ for flavone and 70◦ for isoflavone. The
coordination geometries of flavone and isoflavone show that the coordina-
tion between flavone and heme is closer to the optimal values and, therefore,
leads to better binding (Figure 7.4).According to my model, the positive in-
fluence of a C7 hydroxy group in the flavone skeleton (see Section 3.5 on
page 23) originates from the hydrogen bond to S478. A hydroxy group at
C6 or C8 is slightly shifted compared to one at C7, and does not support a
hydrogen bond. The C4’ hydroxy substitution leads to the unfavourable in-
teraction between a polar hydroxy group and the hydrophobic environment
formed by residues I132, F148 and C299, and thereby, to decreased activity.
Wang et al. [93] found coumestrol 19 (Figure 3.10 on page 27) to be a better
inhibitor of aromatase than any of the lignans they measured in the same
study. My model suggests that the good inhibitory activity is the result of a
relatively good coordination geometry between the carbonyl group and the
heme iron, including a tilting angle of 67◦, a distance of 1.8 nm between the
iron and the oxygen, and an Fe–O=C angle of 139◦. Additionally, two hydro-
gen bonds between coumestrol and S478 and C299 (Figure 7.4), were found
to anchor coumestrol in place. Mutants S478T and H480Q decreased the
binding affinity for coumestrol.[75] Mutation of the serine at 478 to a thre-
onine probably alters the position and orientation of the hydrogen bonding
counterpart with coumestrol and leads to the loss of the hydrogen bond.
H480 is located near the aromatic skeleton of coumestrol, and the mutation
to a glutamine probably leads to the loss of π – π interactions and a decrease
67
Chapter 7. Results and discussion
of inhibition potency.
Figure 7.5 – Letrozole 21 docked into the active site of aromatase.Letrozole is coloured lemon, and the heme iron is represented asa magenta sphere. For clarity, only selected residues are displayedand hydrogens are omitted.
In further validation of the homology model, MDS was performed for a
letrozole–aromatase complex (unpublished results). Letrozole 21 (Figure 3.11
on page 29), was docked into the active site and MDS was performed. The
simulation showed stable behaviour in terms of potential energy and the
backbone rmsd. The representative structure of the MD trajectory was ex-
tracted and letrozole was docked into the active site. The resulting orienta-
tion is shown in Figure 7.5. Mutagenesis studies [44, 62] have shown that
mutants E302D, I133Y, S478T, S478A, H480Q and H480K are more resistant
to letrozole than the wild-type enzyme. According to my model, the conser-
vative mutation E302D leads to the loss of the hydrogen bond between E302
and Q298, probably affecting the active site shape or volume. I133Y prob-
ably results in decrease in the active site cavity volume, S478T and S478A
change the position of a possible hydrogen bonding counterpart, and H480K
68
7.2. Molecular modelling of 17β-HSD1 and its inhibitors (III and IV)
and H480Q result in the loss of aromatic interactions between letrozole and
H480. The mutant D309A showed highly increased resistance to the in-
hibitor. This is hard to explain with my model, as are the other results with
D309 mutants (see publication I). The position of D309 is well established: it
is located in the I-helix near the very important T310. In the model, letrozole
is not oriented correctly to form hydrogen bonds with D309, and probably
it cannot change the orientation to establish such interactions. Most likely,
then, the D309A mutation leads to conformational changes in the active site.
I474 lies further away from the active site than do S478 and H480 and mu-
tants I474Y, I474W, I474M and I474N showed increased affinity to letrozole
[44], suggesting that the active site is somehow modified by a mutation at
this position. It is difficult to draw further conclusions, however, without
studying the effect of the mutation by MDS or other comparable method.
7.2 Molecular modelling of 17β-HSD1
and its inhibitors (III and IV)
Another attractive target for blocking estrogen production, in addition to
aromatase, is the 17β-HSD1 enzyme function. A high-throughput screening
(HTS) study against the 17β-HSD1 enzyme led to the discovery of a lead
compound 29 (Figure 7.6) [186], and a set of inhibitors based on the tetra-
cyclic skeleton of this lead compound were synthesised in our laboratory
and tested for inhibitory activity.[187] To simulate the dynamic process of
ligand binding and find the structural factors affecting it, MDSs were per-
formed for two potent inhibitors. The crystal structure of 17β-HSD1 (PDB
entry 1FDT) [107] containing the reduced product E2 and the oxidised cofac-
tor NADP+ was first relaxed using minimisation and MDS. This structure
was because there is evidence that the active site of aldo-keto reductase en-
zymes adapts to the ligand during binding [188, 189] and that the cofactor
NADP+ binds before the substrate.[190]
69
Chapter 7. Results and discussion
the lead compound 29
S
O
OH
N
N
O
S
O
N
N
O
S
O
N
N
O
30
31
Figure 7.6 – Structures of the inhibitors 29 and 30 used in the 17β-HSD1 MDS studies, and the structure of the most potent inhibitor31.
First, the crystal structure was relaxed using MDS. During the simulation
of the crystal structure, the reduced product estradiol E2 was moved away
from the catalytic triad (Ser142, Tyr155 and Lys159) and from the cofactor
NADP+. This is probably due to the rejection of the product by the active
site and the relocation of Phe192, a residue located in a loop with two differ-
ent conformations in the crystal structure. Figure 7.7 shows the positions of
E2 in the crystal structure and after the relaxation.
As a means to finding the optimal interactions between the active site and
the inhibitors for MDS studies, the lead compound 29 and another potent
inhibitor 30 were docked into the active site of the relaxed enzyme using
GOLD [166]. Despite the similarity of the structures of the two inhibitors,
two different binding modes were observed (Figure 7.8). MDS studies of the
two inhibitor–enzyme complexes revealed that the orientation of the lead
compound 29 was more stable and it formed active site interactions that
were not observed in the original or relaxed crystal structure of 17β-HSD1.
70
7.2. Molecular modelling of 17β-HSD1 and its inhibitors (III and IV)
Figure 7.7 – The positions of E2 before (magenta) and after (cyan)the relaxation of the crystal structure 1FDT. NADP+ is colouredblue. For clarity, hydrogens are omitted and the active site residuesare displayed as thin sticks.
Figure 7.8 – The two different binding modes of the inhibitors usedin MDSs. The lead compound 29 is coloured yellow and the otherpotent inhibitor 30 is coloured green. For clarity, hydrogens areomitted and the active site residues are displayed as thin sticks.
71
Chapter 7. Results and discussion
Figure 7.9 – The aligned 17β-HSD1 inhibitors after docking. Theprotein surface is displayed as a transparent surface and a part ofNADP+ is shown in blue. For clarity, hydrogens are omitted.
Figure 7.10 – The correlation between the predicted and LOGITtransformed pIC50 values of the 17β-HSD1 inhibitors. The trainingset is marked with blue squares and the test set with red circles.
72
7.2. Molecular modelling of 17β-HSD1 and its inhibitors (III and IV)
In a next step, a library of tetracyclic pyrimidinones synthesised in our lab-
oratory [187] was docked, using Surflex-Dock [170], into the enzyme struc-
ture obtained from MDS with the lead compound 29. The compounds were
found to be well aligned (Figure 7.9), and the ranking from the docking was
in accordance with the experimental inhibition data. This encouraged the
construction of a 3D-QSAR model to predict the activities of compounds not
yet tested for biological activity. Since no IC50 values had been determined
for the inhibitors, but only inhibition percentages at two inhibitor concentra-
tions (0.1 µM and 1 µM), LOGIT transformation [48] was used to obtain the
IC50 values. These values were then employed as dependant variables dur-
ing the CoMFA model building. The more complex CoMSIA method was
tested, too, but the resulting model had no predictive power. The CoMFA
model, on the other hand, had good predictive power, verified with an ex-
ternal test set (Figure 7.10).
The visualisation of the CoMFA fields reveals areas where a certain type
of interaction is either preferred or not preferred. The superimposition of
the CoMFA fields with the enzyme structure from the MDS with the most
potent compound 31 (Figures 7.6 and 7.11) revealed a hydrophobic pocket
(green area) formed by residues V143, G144, L149, F192, M193 and F259.
This pocket accommodates the bulky aromatic thioether side chain of in-
hibitor 31. Red areas located next to residues N152 and R258 suggest that a
negatively charged group at this position would serve as a hydrogen bond-
ing counterpart for these residues, and the blue area near Y218 suggests a
positively charged group at this position.
In summary, MDS with the lead compound 29, identification of the plau-
sible binding modes of the inhibitors and generation of the CoMFA model
helped in understanding the requirements for an active inhibitor. In partic-
ular, the CoMFA fields superimposed with the enzyme structure revealed
the hydrophobic pocket formed during the MDS, and the predictive power
of the model can help in the design of even more active inhibitors of 17β-
HSD1.
73
Chapter 7. Results and discussion
Figure 7.11 – The CoMFA fields superimposed with the enzymestructure obtained from MDS with the lead compound 29. Themost potent inhibitor 31 is coloured orange and NADP+ blue.
Pharmacophore modelling is a new method in 17β-HSD1 drug discovery
projects. Four pharmacophore models were developed in this work, based
on A) the relaxed crystal structure of 17β-HSD1 from the QSAR study (see
publication III), B) our thienopyrimidinone library [187], C) the 17β-HSD1
crystal structure complex with E2 (PDB entry 1FDT [107]), and D) the docked
complex of our most potent inhibitor 31 and 17β-HSD1(see publication III).
Where a complex was analysed, the LigandScout program [138, 139] was
used, and where ligands alone or the receptor alone were analysed, the Dis-
covery Studio 2.1 software (DS) [168] was employed. With the data noted
above, four pharmacophore hypotheses were generated, differing from each
other in number or the nature of the pharmacophoric features. Figure 7.12
shows the four generated pharmacophores.
The Interaction Generation protocol in DS, performed on the relaxed crys-
tal structure from III, followed by clustering of the pharmacophoric fea-
tures and manual adjustment, produced five pharmacophoric features (A
74
7.2. Molecular modelling of 17β-HSD1 and its inhibitors (III and IV)
Figure 7.12 – The four generated 17β-HSD1 pharmacophoresbased on A) the relaxed crystal structure, B) the thienopyrimidi-none library, C) the crystal structure 1FDT and D) the complex ofour most potent inhibitor 31 and 17β-HSD1. For clarity, hydro-gens are omitted and in pharmacophore D only selected exclusionspheres are displayed.
in Figure 7.12). Two hydrogen bond donors were pointed towards S222 and
E282 and one hydrogen bond acceptor was pointed towards S142. Addi-
tionally, two hydrophobic centres were located in the centre of the active
site. The Common Feature Pharmacophore Generation protocol of DS was
used to find the most important factors contributing to tight binding in our
thienopyrimidinone inhibitor library (B in Figure 7.12). Three hydrogen
bond donor features were identified, two located at the carbonyl groups
and one at the pyrimidinone nitrogen. However, the donor feature at the
nitrogen was manually changed to a hydrophobic feature after careful in-
spection of the active site residues showed that there are no residues in this
area capable of functioning as hydrogen bond donors. The interactions in
75
Chapter 7. Results and discussion
the complex crystal structure 1FDT [107] were identified with the Ligand-
Scout program. An automated pharmacophore generation identified two
hydrogen bond acceptors, one pointing from the C3 hydroxyl towards Y155
and one from the C17 hydroxyl towards H221, and two hydrophobic fea-
tures, one located at C18 and one at the aromatic ring A of E2 (C in Figure
7.12). LigandScout was also used for the pharmacophore generation of the
docked complex of our most potent inhibitor 31 (Figure 7.6) and 17β-HSD1
after MDS. The program generated a hydrogen bond from the thiophenyl
side chain hydroxyl towards the backbone carbonyl of G144, two hydrogen
bonds from the two carbonyls in the compound towards Y218 and R258,
and one hydrophobic feature around the thiophenyl substituent (D in Fig-
ure 7.12).
The Maybridge HitFinder database [191], consisting of 14,400 drug-like com-
pounds, was used in the database searches. To enlarge the conformational
space of the compounds in the database, the Diverse Conformation Gen-
eration protocol in DS was used to generate a maximum of 10 conformers
for each compound. The Ligand Pharmacophore Mapping protocol was
used to rank the conformers from the database on the basis of their fit to the
pharmacophores, and the best ranking conformer of each compound was
accepted or rejected on the basis of their scaled FitValues (from 0 to 1). Com-
pounds having a scaled FitValue of ≥ 0.5 were accepted. These compounds
were then docked to the 17β-HSD1 structure using GOLD Suite [166] and
the default scoring function Goldscore. Additionally, the docked solutions
were rescored with the Chemscore scoring function. The above-mentioned
methods (FitValue, Goldscore and Chemscore) were used to acquire a con-
sensus score for each of the compounds, and the compounds shown in
Figure 7.13 were identified. Additionally, the compounds identified for
pharmacophores originating from our inhibitors (pharmacophores B and D)
were evaluated with the developed QSAR model (see publication III). All
compounds were predicted to have inhibitory activity at sub-micromolar
levels. Note that the compounds in Figure 7.13 should be treated as poten-
76
7.2. Molecular modelling of 17β-HSD1 and its inhibitors (III and IV)
tial lead candidates because they possess functional groups that are usually
unfavoured in drug-like compounds.
In summary, four pharmacophores generated by different methods were
used to identify the contributions to the binding affinity to the 17β-HSD1
enzyme. The pharmacophores were used as 3D database search queries,
and novel scaffolds for potential 17β-HSD1 inhibitors were identified.
NS
N HNHN
O
F N
N
SHN
ONH
SO
O
F F
NS
N
NO NH2
O
ON
S
NN
NCl
Pharmacophore B
S
NS
NCF3
O
SO
O
HN
N N
CF3
Pharmacophore C
O
O
NH
O
NNS
F S
OCF3
CF3
S
NN
NF3C
Pharmacophore D
HON
NO2
O
CF3
ON
H2NN N
NHS
NO
CF3
Pharmacophore A
Figure 7.13 – Compounds identified from the MaybridgeHitFinder database using the generated pharmacophores and con-sensus ranking.
77
Chapter 8
Conclusions
In this study of the structure and interactions of the estrogen-producing en-
zymes CYP450 aromatase and 17β-HSD1, I have shown that, despite the low
sequence identity, a stable aromatase model that explains most of the results
from mutagenesis studies can be built, using a mammalian CYP450 crystal
structure as template. Further, this model explains the differences in the
binding affinity of phytoestrogenic compounds. State-of-the-art software
was used in the process, and molecular dynamics simulations in particular,
proved to be an excellent tool in model validation and study of the bind-
ing modes of ligands with aromatase . The same was shown for 17β-HSD1.
QSAR in combination with sophisticated docking procedures were used to
generate a predictive model for 17β-HSD1 inhibitors having a tetracyclic
thienopyrimidinone scaffold. Finally, 17β-HSD1 pharmacophore genera-
tion followed by virtual screening produced several candidates for future
drug design projects.
The results of this study can be utilised in designing new inhibitors for both
of the enzymes. The predicted structure of aromatase can be applied in
de novo design of aromatase inhibitors and study of their interactions with
the enzyme, and the QSAR model of 17β-HSD1 can be used to predict the
activities of newly designed inhibitor candidates prior to their synthesis.
79
Chapter 8. Conclusions
The pharmacophore models can be used in searching any 3D database for
lead candidates.
Altogether, this work shows the power of molecular modelling in modern
macromolecular, as well as small molecule, research projects. It comple-
ments traditional wet chemistry and contributes in a valuable way to multi-
disciplinary research projects.
The primary outcomes of this study concerning drug discovery are: A) a
letrozole-validated model of the aromatase enzyme and B) new pharma-
cophores for virtual screening of 17β-HSD1 inhibitors.
80
Bibliography
[1] American Cancer Society, Breast Cancer Fact and Figures 2007-2008, Atlanta: Amer-
ican Cancer Society, Inc.
[2] Russo J, Hasan Lareef M, Balogh G, Guo S, Russo IH. Estrogen and its metabolites
are carcinogenic agents in human breast epithelial cells. J. Steroid Biochem. Mol. Biol.
87, 1–25, 2003.
[3] Nelson LR, Bulun SE. Estrogen production and action. J. Am. Acad. Dermatol. 45,
S116–S124, 2001.
[4] Santen R. To block estrogen’s synthesis or action: That is the question. J. Clin. En-
docrinol. Metab. 87, 3007–3012, 2002.
[5] de Ziegler D, Mattenberger C, Luyet C, Romoscanu I, Irion NF, Bianchi-Demicheli F.
Clinical use of aromatase inhibitors (AI) in premenopausal women. J. Steroid Biochem.
Mol. Biol. 95, 121–127, 2005.
[6] Ackerman GE, Carr BR. Estrogens. Rev. Endocr. Metab. Disord. 3, 225–230, 2002.
[7] Nussey S, Whitehead S. Endocrinology: An Integrated Approach. Taylor & Francis,
Mortimer House, 37-41 Mortimer Street, London, W1T 3JH, UK, 2001.
[8] Fluck CE, Miller WL, Auchus RJ. The 17, 20-lyase activity of cytochrome P450c17
from human fetal testis favors the δ5 steroidogenic pathway. J. Clin. Endocrinol. Metab.
88, 3762–3766, 2003.
[9] Labrie F, Luu-The V, Lin SX, Simard J, Labrie C. Role of 17β-hydroxysteroid dehy-
drogenases in sex steroid formation in peripheral intracrine tissues. Trends Endocrinol.
Metab. 11, 421–427, 2000.
[10] Subramanian A, Salhab M, Mokbel K. Oestrogen producing enzymes and mammary
carcinogenesis: A review. Breast Cancer Res. Treat. 111, 191–202, 2008.
[11] Ghosh D. Human sulfatases: A structural perspective to catalysis. Cell. Mol. Life Sci.
64, 2013–2022, 2007.
81
Bibliography
[12] Fu XD, Simoncini T. Extra-nuclear signaling of estrogen receptors. IUBMB Life 60,
502–510, 2008.
[13] Williams PA, Cosme J, Sridhar V, Johnson EF, McRee DE. Mammalian microsomal
cytochrome P450 monooxygenase: Structural adaptations for membrane binding and
functional diversity. Mol. Cell 5, 121–131, 2000.
[14] Guengerich FP. Cytochrome P450: What have we learned and what are the future
issues? Drug Metab. Rev. 36, 159–197, 2004.
[15] Peterson JA, Graham SE. A close family resemblance: The importance of structure in
understanding cytochromes P450. Structure 6, 1079–1085, 1998.
[16] Auclair K, Moenne-Loccoz P, Ortiz de Montellano P. Roles of the proximal heme
thiolate ligand in cytochrome P450cam. J. Am. Chem. Soc. 123, 4877–4885, 2001.
[17] Meunier B, deVisser S, Shaik S. Mechanism of oxidation reactions catalyzed by cy-
tochrome P450 enzymes. Chem. Rev. 104, 3947–3980, 2004.
[18] Schlichting I, Berendzen J, Chu K, Stock AM, Maves SA, Benson DE, Sweet RM, Ringe
D, Petsko GA, Sligar SG. The catalytic pathway of cytochrome P450cam at atomic
resolution. Science 287, 1615–1622, 2000.
[19] Bulun SE, Lin Z, Imir G, Amin S, Demura M, Yilmaz B, Martin R, Utsunomiya H,
Thung S, Gurates B, Tamura M, Langoi D, Deb S. Regulation of aromatase expres-
sion in estrogen-responsive breast and uterine disease: From bench to treatment.
Pharmacol. Rev. 57, 359–383, 2005.
[20] Simpson E, Mahendroo M, Means G, Kilgore M, Hinshelwood M, Graham-Lorence
S, Amarneh B, Ito Y, Fisher C, Michael M. Aromatase cytochrome P450, the enzyme
responsible for estrogen biosynthesis. Endocr. Rev. 15, 342–355, 1994.
[21] Miller WR, Mullen P, Sourdaine P, Watson C, Dixon JM, Telford J. Regulation of
aromatase activity within the breast. J. Steroid Biochem. Mol. Biol. 61, 193–202, 1997.
[22] Kagawa N, Hori H, Waterman MR, Yoshioka S. Characterization of stable human
aromatase expressed in E. coli. Steroids 69, 235–243, 2004.
[23] Corbin CJ, Mapes SM, Lee YM, Conley AJ. Structural and functional differences
among purified recombinant mammalian aromatases: Glycosylation, N-terminal se-
quence and kinetic analysis of human, bovine and the porcine placental and gonadal
isozymes. Mol. Cell. Endocrinol. 206, 147–157, 2003.
[24] Zhang F, Zhou D, Kao YC, Ye J, Chen S. Expression and purification of a recombinant
form of human aromatase from Escherichia coli. Biochem. Pharmacol. 64, 1317–1324,
2002.
82
Bibliography
[25] Amarneh B, Simpson ER. Expression of a recombinant derivative of human aro-
matase P450 in insect cells utilizing the baculovirus vector system. Mol. Cell. En-
docrinol. 109, R1–R5, 1995.
[26] Sigle RO, Titus MA, Harada N, Nelson SD. Baculovirus mediated high level expres-
sion of human placental aromatase (CYP19A1). Biochem. Biophys. Res. Commun. 201,
694–700, 1994.
[27] Gotoh O. Substrate recognition sites in cytochrome P450 family 2 (CYP2) proteins
inferred from comparative analyses of amino acid and coding nucleotide sequences.
J. Biol. Chem. 267, 83–90, 1992.
[28] Johnson EF, Stout CD. Structural diversity of human xenobiotic-metabolizing cy-
tochrome P450 monooxygenases. Biochem. Biophys. Res. Commun. 338, 331–336, 2005.
[29] Kellis JT J, Vickery L. Purification and characterization of human placental aromatase
cytochrome P-450. J. Biol. Chem. 262, 4413–4420, 1987.
[30] Groves JT, Watanabe Y. Reactive iron porphyrin derivatives related to the catalytic
cycles of cytochrome P-450 and peroxidase. Studies of the mechanism of oxygen ac-
tivation. J. Am. Chem. Soc. 110, 8443–8452, 1988.
[31] Hackett J, Brueggemeier R, Hadad C. The final catalytic step of cytochrome P450
aromatase: A density functional theory study. J. Am. Chem. Soc. 127, 5224–5237, 2005.
[32] Loge C, Borgne ML, Marchand P, Robert JM, Baut GL, Palzer M, Hartmann RW.
Three-dimensional model of cytochrome P450 human aromatase. J. Enzym. Inhib.
Med. Chem. 20, 581–585, 2005.
[33] Poulos T, Finzel B, Gunsalus I, Wagner G, Kraut J. The 2.6 A crystal structure of
Pseudomonas putida cytochrome P-450. J. Biol. Chem. 260, 16122–16130, 1985.
[34] Graham-Lorence S, Khalil M, Lorence M, Mendelson C, Simpson E. Structure-
function relationships of human aromatase cytochrome P-450 using molecular mod-
eling and site-directed mutagenesis. J. Biol. Chem. 266, 11939–11946, 1991.
[35] Zhou D, Korzekwa K, Poulos T, Chen S. A site-directed mutagenesis study of human
placental aromatase. J. Biol. Chem. 267, 762–768, 1992.
[36] Laughton CA, Zvelebil MJ, Neidle S. A detailed molecular model for human aro-
matase. J. Steroid Biochem. Mol. Biol. 44, 399–407, 1993.
[37] Barton GJ, Sternberg MJE. A strategy for the rapid multiple alignment of protein
sequences : Confidence levels from tertiary structure comparisons. J. Mol. Biol. 198,
327–337, 1987.
83
Bibliography
[38] Zvelebil MJ, Barton GJ, Taylor WR, Sternberg MJE. Prediction of protein secondary
structure and active sites using the alignment of homologous sequences. J. Mol. Biol.
195, 957–961, 1987.
[39] QUANTA, Accelrys Inc., 10188 Telesis Court, Suite 100 San Diego, CA 92121, USA.
[40] Ravichandran K, Boddupalli S, Hasermann C, Peterson J, Deisenhofer J. Crystal
structure of hemoprotein domain of P450bm-3, a prototype for microsomal P450’s.
Science 261, 731–736, 1993.
[41] Hasemann CA, Ravichandran KG, Peterson JA, Deisenhofer J. Crystal structure and
refinement of cytochrome P450terp at 2.3 A resolution. J. Mol. Biol. 236, 1169–1185,
1994.
[42] Graham-Lorence S, Amarneh B, White RE, Peterson JA, Simpson ER. A three-
dimensional model of aromatase cytochrome P450. Protein Sci. 4, 1065–1080, 1995.
[43] Koymans MH L M, Vanden Bossche H. A molecular model for the interaction be-
tween vorozole and other non-steroidal inhibitors and human cytochrome P450 19
(P450 aromatase). J. Steroid Biochem. Mol. Biol. 53, 191–197, 1995.
[44] Kao YC, Cam LL, Laughton CA, Zhou D, Chen S. Binding characteristics of seven
inhibitors of human aromatase: A site-directed mutagenesis study. Cancer Res. 56,
3451–3460, 1996.
[45] Nelson DR. Cytochrome P450 nomenclature and alignment of selected sequences. In
Cytochrome P450: Structure, Mechanism and Biochemistry, P Ortiz de Montellano, ed.,
pp. 575–606. Plenum Press: New York, 2 edn., 1995.
[46] Laughton CA. Prediction of protein side-chain conformations from local three-
dimensional homology relationships. J. Mol. Biol. 235, 1088–1097, 1994.
[47] Lewis D, Lee-Robichaud P. Molecular modelling of steroidogenic cytochromes P450
from families CYP11, CYP17, CYP19 and CYP21 based on the CYP102 crystal struc-
ture. J. Steroid Biochem. Mol. Biol. 66, 217–233, 1998.
[48] Sybyl 7.3, Tripos Inc., 1699 South Hanley Rd., St. Louis, MO, USA.
[49] Cavalli A, Greco G, Novellino E, Recanatini M. Linking CoMFA and protein homol-
ogy models of enzyme-inhibitor interactions: An application to non-steroidal aro-
matase inhibitors. Bioorg. Med. Chem. 8, 2771–2780, 2000.
[50] Jones DT. Protein secondary structure prediction based on position-specific scoring
matrices. J. Mol. Biol. 292, 195–202, 1999.
[51] Bryson K, McGuffin LJ, Marsden RL, Ward JJ, Sodhi JS, Jones DT. Protein structure
prediction servers at University College London. Nucleic Acids Res. 33, W36–38, 2005.
84
Bibliography
[52] Auvray P, Nativelle C, Bureau R, Dallemagne P, Seralini GE, Sourdaine P. Study
of substrate specificity of human aromatase by site directed mutagenesis. Eur. J.
Biochem. 269, 1393–1405, 2002.
[53] Chen S, Zhang F, Sherman MA, Kijima I, Cho M, Yuan YC, Toma Y, Osawa Y, Zhou
D, Eng ET. Structure-function studies of aromatase and its inhibitors: A progress
report. J. Steroid Biochem. Mol. Biol. 86, 231–237, 2003.
[54] Wester M, Johnson E, Marques-Soares C, Dansette P, Mansuy D, Stout C. Structure
of a substrate complex of mammalian cytochrome P450 2C5 at 2.3 A resolution: Evi-
dence for multiple substrate binding modes. Biochemistry 42, 6370–6379, 2003.
[55] Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: Improving the sensitivity
of progressive multiple sequence alignment through sequence weighting, position-
specific gap penalties and weight matrix choice. Nucl. Acids Res. 22, 4673–4680, 1994.
[56] Petrey D, Xiang Z, Tang CL, Xie L, Gimpelev M, Mitros T, Soto CS, Goldsmith-
Fischman S, Kernytsky A, Schlessinger A, Koh IY, Alexov E, Honig B. Using multiple
structure alignments, fast model building, and energetic analysis in fold recognition
and homology modeling. Proteins 53, 430–435, 2003.
[57] Xiang Z, Honig B. Extending the accuracy limits of prediction for side-chain confor-
mations. J. Mol. Biol. 311, 421–430, 2001.
[58] Jones G, Willet P, Glen RC, Leach AR, Taylor R. Development and validation of a
genetic algorithm for flexible docking. J. Mol. Biol. 267, 727–748, 1997.
[59] Ahmed S. Review of the molecular modelling studies of the cytochrome P-450 estro-
gen synthetase enzyme, aromatase. Drug. Des. Discov. 15, 239–252, 1998.
[60] Furet P, Batzl C, Bhatnagar A, Francotte E, Rihs G, Lang M. Aromatase inhibitors:
Synthesis, biological activity, and binding mode of azole-type compounds. J. Med.
Chem. 36, 1393–1400, 1993.
[61] Kellis J JT, Childers W, Robinson C, Vickery L. Inhibition of aromatase cytochrome
P-450 by 10-oxirane and 10-thiirane substituted androgens. Implications for the struc-
ture of the active site. J. Biol. Chem. 262, 4421–4426, 1987.
[62] Kao YC, Korzekwa KR, Laughton CA, Chen S. Evaluation of the mechanism of aro-
matase cytochrome P450. A site-directed mutagenesis study. Eur. J. Biochem. 268,
243–251, 2001.
[63] Williams PA, Cosme J, Ward A, Angove HC, Matak Vinkovic D, Jhoti H. Crystal
structure of human cytochrome P450 2C9 with bound warfarin. Nature 424, 464–468,
2003.
85
Bibliography
[64] Favia AD, Cavalli A, Masetti M, Carotti A, Recanatini M. Three-dimensional model
of the human aromatase enzyme and density functional parameterization of the iron-
containing protoporphyrin IX for a molecular dynamics study of heme-cysteinato
cytochromes. Proteins 62, 1074–1087, 2006.
[65] Huang X, Miller W. A time-efficient, linear-space local similarity algorithm. Adv.
Appl. Math. 12, 337–357, 1991.
[66] Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial re-
straints. J. Mol. Biol. 234, 779–815, 1993.
[67] Case D, Darden T, Cheatham T III, Simmerling C, Wang J, Duke R, Luo R, Merz K,
Wang B, Pearlman D, Crowley M, Brozell S, Tsui V, Gohlke H, Mongan J, Hornak V,
Cui G, Beroza P, Schafmeister C, Caldwell J, Ross W, Kollman P. AMBER 8. University
of California, San Francisco, 2004.
[68] Murthy JN, Nagaraju M, Sastry GM, Rao AR, Sastry GN. Active site acidic residues
and structural analysis of modelled human aromatase: A potential drug target for
breast cancer. J. Comput.-Aided Mol. Des. 19, 857–870, 2005.
[69] Amarneh B, Corbin C, Peterson J, Simpson E, Graham-Lorence S. Functional do-
mains of human aromatase cytochrome P450 characterized by linear alignment and
site-directed mutagenesis. Mol. Endocrinol. 7, 1617–1624, 1993.
[70] Chen S, Zhou D. Functional domains of aromatase cytochrome P450 inferred from
comparative analyses of amino acid sequences and substantiated by site-directed
mutagenesis experiments [published erratum appears in J Biol Chem 1994 Jan
14;269(2):1564]. J. Biol. Chem. 267, 22587–22594, 1992.
[71] Kao YC, Zhou C, Sherman M, Laughton CA, Chen S. Molecular basis of the inhibition
of human aromatase (estrogen synthetase) by flavone and isoflavone phytoestrogens:
A site-directed mutagenesis study. Environ. Health Perspect. 106, 85–92, 1998.
[72] Zhou D, Cam L, Laughton C, Korzekwa K, Chen S. Mutagenesis study at a postulated
hydrophobic region near the active site of aromatase cytochrome P450. J. Biol. Chem.
269, 19501–19508, 1994.
[73] Zhou D, Pompon D, Chen S. Structure-function studies of human aromatase by site-
directed mutagenesis: Kinetic properties of mutants Pro-308 → Phe, Tyr-361 → Phe,
Try-361 → Leu, and Phe-406 → Arg. Proc. Natl. Acad. Sci. U. S. A. 88, 410–414, 1991.
[74] Kadohama N, Zhou D, Chen S, Osawa Y. Catalytic efficiency of expressed aromatase
following site-directed mutagenesis. Biochim. Biophys. Acta 1163, 195–200, 1993.
86
Bibliography
[75] Hong Y, Cho M, Yuan YC, Chen S. Molecular basis for the interaction of four different
classes of substrates and inhibitors with human aromatase. Biochem. Pharmacol. 75,
1161–1169, 2008.
[76] Kadohama N, Yarborough C, Zhou D, Chen S, Osawa Y. Kinetic properties of aro-
matase mutants Pro308Phe, Asp309Asn, and Asp309Ala and their interactions with
aromatase inhibitors. J. Steroid Biochem. Mol. Biol. 43, 693–701, 1992.
[77] Conley A, Mapes S, Corbin CJ, Greger D, Graham S. Structural determinants of
aromatase cytochrome P450 inhibition in substrate recognition site-1. Mol. Endocrinol.
16, 1456–1468, 2002.
[78] Cupp-Vickery JR, Poulos TL. Structure of cytochrome P450eryf involved in ery-
thromycin biosynthesis. Nat. Struct. Mol. Biol. 2, 144–153, 1995.
[79] Adlercreutz H. Phytoestrogens: Epidemiology and a possible role in cancer protec-
tion. Environ. Health Perspect. 103, 103–112, 1995.
[80] Mazur WM, Uehara M, Wahala K, Adlercreutz H. Phyto-oestrogen content of berries,
and plasma concentrations and urinary excretion of enterolactone after a single
strawberry-meal in human subjects. Br. J. Nutr. 83, 381–387, 2000.
[81] Milder I, Arts I, Venema D, Lasaroms J, Wahala K, Hollman P. Optimization of
a liquid chromatography-tandem mass spectrometry method for quantification of
the plant lignans secoisolariciresinol, matairesinol, lariciresinol, and pinoresinol in
foods. J. Agric. Food Chem. 52, 4643–4651, 2004.
[82] Milder IE, Feskens EJ, Arts IC, de Mesquita HB, Hollman PC, Kromhout D. Intake
of the plant lignans secoisolariciresinol, matairesinol, lariciresinol, and pinoresinol in
dutch men and women. J. Nutr. 135, 1202–1207, 2005.
[83] Kiely M, Faughnan M, Wahala K, Brants H, Mulligan A. Phyto-oestrogen levels in
foods: The design and construction of the VENUS database. Br. J. Nutr. 89, Suppl 1,
S19–23, 2003.
[84] Lof M, Weiderpass E. Epidemiologic evidence suggests that dietary phytoestrogen
intake is associated with reduced risk of breast, endometrial, and prostate cancers.
Nutr. Res. 26, 609–619, 2006.
[85] Karkola S, Lilienkampf A, Wahala K. Phytoestrogens in drug discovery for control-
ling steroid biosynthesis. In Recent Advances in Polyphenol Research, vol. 1, F Daayf,
V Lattanzio, eds., pp. 293–316. Blackwell Publishing, 2008.
[86] Kellis Jr JT, Vickery LE. Inhibition of human estrogen synthetase (aromatase) by
flavones. Science 225, 1032–1034, 1984.
87
Bibliography
[87] Kellis Jr JT, Nesnow S, Vickery LE. Inhibition of aromatase cytochrome P-450 (estro-
gen synthetase) by derivatives of α-naphthoflavone. Biochem. Pharmacol. 35, 2887–
2891, 1986.
[88] Sanderson JT, Hordijk J, Denison MS, Springsteel MF, Nantz MH, van den Berg M.
Induction and inhibition of aromatase (CYP19) activity by natural and synthetic fla-
vonoid compounds in H295R human adrenocortical carcinoma cells. Toxicol. Sci. 82,
70–79, 2004.
[89] Rupp B, Raub S, Marian C, Holtje HD. Molecular design of two sterol 14α-
demethylase homology models and their interactions with the azole antifungals ke-
toconazole and bifonazole. J. Comput.-Aided Mol. Des. 19, 149–163, 2005.
[90] Ibrahim AR, Abul-Hajj YJ. Aromatase inhibition by flavonoids. J. Steroid Biochem.
Mol. Biol. 37, 257–260, 1990.
[91] Le Bail J, Laroche T, Marre-Fournier F, Habrioux G. Aromatase and 17β-
hydroxysteroid dehydrogenase inhibition by flavonoids. Cancer Lett. 133, 101–106,
1998.
[92] Adlercreutz H, Bannwart C, Wahala K, Makela T, Brunow G, Hase T, Arosemena PJ,
Kellis Jr JT, Vickery LE. Inhibition of human aromatase by mammalian lignans and
isoflavonoid phytoestrogens. J. Steroid Biochem. Mol. Biol. 44, 147–153, 1993.
[93] Wang C, Makela T, Hase T, Adlercreutz H, Kurzer MS. Lignans and flavonoids inhibit
aromatase enzyme in human preadipocytes. J. Steroid Biochem. Mol. Biol. 50, 205–212,
1994.
[94] Borriello SP, Setchell KD, Axelson M, Lawson AM. Production and metabolism of
lignans by the human faecal flora. J. Appl. Bacteriol. 58, 37–43, 1995.
[95] Clavel T, Henderson G, Engst W, Dore J, Blaut M. Phylogeny of human intestinal bac-
teria that activate the dietary lignan secoisolariciresinol diglucoside. FEMS Microbiol.
Ecol. 55, 471–478, 2006.
[96] Wang LQ, Meselhy MR, Li Y, Qin GW, Hattori M. Human intestinal bacteria capable
of transforming secoisolariciresinol diglucoside to mammalian lignans, enterodiol
and enterolactone. Chem. Pharm. Bull. 48, 1606–1610, 2000.
[97] Raffaelli B, Leppala E, Chappuis C, Wahala K. New potential mammalian lignan
metabolites of environmental phytoestrogens. Environ. Chem. Lett. 4, 1–9, 2006.
[98] Bickoff EM, Booth AN, Lyman RL, Livingston AL, Thompson CR, Deeds F. Coume-
strol, a new estrogen isolated from forage crops. Science 126, 969–970, 1957.
88
Bibliography
[99] Price K, Fenwick G. Naturally occurring estrogens in foods. Food Addit. Contam. 2,
73–106, 1985.
[100] Miller WR, Bartlett J, Brodie AMH, Brueggemeier RW, di Salle E, Lonning PE, Llom-
bart A, Maass N, Maudelonde T, Sasano H, Goss PE. Aromatase inhibitors: Are
there differences between steroidal and nonsteroidal aromatase inhibitors and do
they matter? Oncologist 13, 829–837, 2008.
[101] American Cancer Society, Cancer Facts & Figures 2008, Atlanta: American Cancer
Society.
[102] Lukacik P, Kavanagh KL, Oppermann U. Structure and function of human 17β-
hydroxysteroid dehydrogenases. Mol. Cell. Endocrinol. 248, 61–71, 2006.
[103] Mindnich R, Mller G, Adamski J. The role of 17β-hydroxysteroid dehydrogenases.
Mol. Cell. Endocrinol. 218, 7–20, 2004.
[104] Moeller G, Adamski J. Integrated view on 17β-hydroxysteroid dehydrogenases. Mol.
Cell. Endocrinol. p. doi:10.1016/j.mce.2008.10.040, In press.
[105] Zhu DW, Lee X, Breton R, Ghosh D, Pangborn W, Duax WL, Lin SX. Crystallization
and preliminary X-ray diffraction analysis of the complex of human placental 17β-
hydroxysteroid dehydrogenase with NADP+. J. Mol. Biol. 234, 242–244, 1993.
[106] Ghosh D, Pletnev VZ, Zhu DW, Wawrzak Z, Duax WL, Pangborn W, Labrie F, Lin
SX. Structure of human estrogenic 17β-hydroxysteroid dehydrogenase at 2.20 A res-
olution. Structure 3, 503–513, 1995.
[107] Breton R, Housset D, Mazza C, Fontecilla-Camps JC. The structure of a complex
of human 17β-hydroxysteroid dehydrogenase with estradiol and NADP+ identifies
two principal targets for the design of inhibitors. Structure 4, 905–915, 1996.
[108] Azzi A, Rehse PH, Zhu DW, Campbell RL, Labrie F, Lin SX. Crystal structure of hu-
man estrogenic 17β-hydroxysteroid dehydrogenase complexed with 17β-estradiol.
Nat. Struct. Mol. Biol. 3, 665–668, 1996.
[109] Sawicki MW, Erman M, Puranen T, Vihko P, Ghosh D. Structure of the ternary
complex of human 17β-hydroxysteroid dehydrogenase type 1 with 3-hydroxyestra-
1,3,5,7-tetraen-17-one (equilin) and NADP+. Proc. Natl. Acad. Sci. 96, 840–845, 1999.
[110] Han Q, Campbell RL, Gangloff A, Huang YW, Lin SX. Dehydroepiandrosterone and
dihydrotestosterone recognition by human estrogenic 17β -hydroxysteroid dehydro-
genase. C-18/C-19 steroid discrimination and enzyme-induced strain. J. Biol. Chem.
275, 1105–1111, 2000.
89
Bibliography
[111] Gangloff A, Shi R, Nahoum V, Lin SX. Pseudo-symmetry of C19 steroids, alternative
binding orientations, and multispecificity in human estrogenic 17β-hydroxysteroid
dehydrogenase. FASEB J. 17, 274–276, 2003.
[112] Shi R, Lin SX. Cofactor hydrogen bonding onto the protein main chain is conserved
in the short chain dehydrogenase/reductase family and contributes to nicotinamide
orientation. J. Biol. Chem. 279, 16778–16785, 2004.
[113] Qiu W, Campbell RL, Gangloff A, Dupuis P, Boivin RP, Tremblay MR, Poirier D,
Lin SX. A concerted, rational design of type 1 17β-hydroxysteroid dehydrogenase
inhibitors: Estradiol-adenosine hybrids with high affinity. FASEB J. 2002.
[114] Gunnarson C, Ahnstrom M, Kirschner K, Olsson B, Nordenskjold B, Rutqvist LE,
Skoog L, Stal O. Amplification of HSD17B1 and ERBB2 in primary breast cancer.
Oncogene 22, 34–40, 2003.
[115] Poutanen M, Isomaa V, Lehto VP, Vihko R. Immunological analysis of 17β-
hydroxysteroid dehydrogenase in benign and malignant human breast tissue. Int.
J. Cancer 50, 386–390, 1992.
[116] Miettinen MM, Poutanen MH, Vihko RK. Characterization of estrogen-dependent
growth of cultured MCF-7 human breast-cancer cells expressing 17β-hydroxysteroid
dehydrogenase type 1. Int. J. Cancer 68, 600–604, 1996.
[117] Sasano H, Frost A, Saitoh R, Harada N, Poutanen M, Vihko R, Bulun S, Silverberg
S, Nagura H. Aromatase and 17β-hydroxysteroid dehydrogenase type 1 in human
breast carcinoma. J. Clin. Endocrinol. Metab. 81, 4042–4046, 1996.
[118] Suzuki T, Moriya T, Ariga N, Kaneko C, Kanazawa M, Sasano H. 17β-hydroxysteroid
dehydrogenase type 1 and type 2 in human breast carcinoma: A correlation to clini-
copathological parameters. Br. J. Cancer 82, 518–523, 2000.
[119] Filling C, Nordling E, Benach J, Berndt KD, Ladenstein R, Jornvall H, Oppermann
U. Structural role of conserved Asn179 in the short-chain dehydrogenase/reductase
scaffold. Biochem. Biophys. Res. Commun. 289, 712–717, 2001.
[120] Filling C, Berndt KD, Benach J, Knapp S, Prozorovski T, Nordling E, Ladenstein R,
Jornvall H, Oppermann U. Critical residues for structure and catalysis in short-chain
dehydrogenases/reductases. J. Biol. Chem. 277, 25677–25684, 2002.
[121] Rao ST, Rossmann MG. Comparison of super-secondary structures in proteins. J.
Mol. Biol. 76, 241–250, 1973.
[122] Joernvall H, Persson B, Krook M, Atrian S, Gonzalez-Duarte R, Jeffery J, Ghosh D.
Short-chain dehydrogenases/reductases (SDR). Biochemistry 34, 6003–6013, 1995.
90
Bibliography
[123] Alho-Richmond S, Lilienkampf A, Wahala K. Active site analysis of 17β-
hydroxysteroid dehydrogenase type 1 enzyme complexes with SPROUT. Mol. Cell.
Endocrinol. 248, 208–213, 2006.
[124] Mazza C, Breton R, Housset D, Fontecilla-Camps JC. Unusual charge stabilization of
NADP+ in 17β-hydroxysteroid dehydrogenase. J. Biol. Chem. 273, 8145–8152, 1998.
[125] Lawrence H, Vicker N, Allan G, Smith A, Mahon M, Tutill H, Purohit A, Reed M,
Potter B. Novel and potent 17β-hydroxysteroid dehydrogenase type 1 inhibitors. J.
Med. Chem. 48, 2759–2762, 2005.
[126] Fischer DS, Allan GM, Bubert C, Vicker N, Smith A, Tutill HJ, Purohit A, Wood L,
Packham G, Mahon MF, Reed MJ, Potter BVL. E-ring modified steroids as novel
potent inhibitors of 17β-hydroxysteroid dehydrogenase type 1. J. Med. Chem. 48,
5749–5770, 2005.
[127] Lemmen C, Lengauer T, Klebe G. FLEXS: A method for fast flexible ligand superpo-
sition. J. Med. Chem. 41, 4502–4520, 1998.
[128] Allan GM, Bubert C, Vicker N, Smith A, Tutill HJ, Purohit A, Reed MJ, Potter BVL.
Novel, potent inhibitors of 17β-hydroxysteroid dehydrogenase type 1. Mol. Cell. En-
docrinol. 248, 204–207, 2006.
[129] Allan GM, Lawrence HR, Cornet J, Bubert C, Fischer DS, Vicker N, Smith A, Tutill HJ,
Purohit A, Day JM, Mahon MF, Reed MJ, Potter BVL. Modification of estrone at the 6,
16, and 17 positions: Novel potent inhibitors of 17β-hydroxysteroid dehydrogenase
type 1. J. Med. Chem. 49, 1325–1345, 2006.
[130] Deluca D, Moeller G, Rosinus A, Elger W, Hillisch A, Adamski J. Inhibitory effects
of fluorine-substituted estrogens on the activity of 17β-hydroxysteroid dehydroge-
nases. Mol. Cell. Endocrinol. 248, 218–224, 2006.
[131] Tremblay MR, Lin SX, Poirier D. Chemical synthesis of 16β-propylaminoacyl deriva-
tives of estradiol and their inhibitory potency on type 1 17β-hydroxysteroid dehy-
drogenase and binding affinity on steroid receptors. Steroids 66, 821–831, 2001.
[132] Poirier D, Boivin RP, Tremblay MR, Berube M, Qiu W, Lin SX. Estradiol-adenosine
hybrid compounds designed to inhibit type 1 17β-hydroxysteroid dehydrogenase. J.
Med. Chem. 48, 8134–8147, 2005.
[133] Poirier D, Chang HJ, Azzi A, Boivin RP, Lin SX. Estrone and estradiol C-16 deriva-
tives as inhibitors of type 1 17β-hydroxysteroid dehydrogenase. Mol. Cell. Endocrinol.
248, 236–238, 2006.
[134] SPROUT 6.0, SimBioSys Inc., 135 Queen’s Plate Drive, Unit 520, Toronto, Ontario,
M9W 6V1, Canada.
91
Bibliography
[135] Messinger J, Hirvela L, Husen B, Kangas L, Koskimies P, Pentikainen O, Saarenketo
P, Thole H. New inhibitors of 17β-hydroxysteroid dehydrogenase type 1. Mol. Cell.
Endocrinol. 248, 192–198, 2006.
[136] Kristan K, Krajnc K, Konc J, Gobec S, Stojan J, Lanisnik Rizner T. Phytoestrogens as
inhibitors of fungal 17β-hydroxysteroid dehydrogenase. Steroids 70, 626–635, 2005.
[137] Schuster D, Nashev LG, Kirchmair J, Laggner C, Wolber G, Langer T, Odermatt
A. Discovery of nonsteroidal 17β-hydroxysteroid dehydrogenase 1 inhibitors by
pharmacophore-based screening of virtual compound libraries. J. Med. Chem. 51,
4188–4199, 2008.
[138] Wolber G, Langer T. LigandScout: 3D pharmacophores derived from protein-bound
ligands and their use as virtual screening filters. J. Chem. Inf. Model. 45, 160–169, 2005.
[139] Wolber G, Dornhofer AA, Langer T. Efficient overlay of small organic molecules
using 3D pharmacophores. J. Comput.-Aided Mol. Des. 20, 773–788, 2007.
[140] Hermann JC, Ridder L, Mulholland AJ, Hoeltje HD. Identification of Glu166 as the
general base in the acylation reaction of class A β-lactamases through QM/MM mod-
eling. J. Am. Chem. Soc. 125, 9590–9591, 2003.
[141] Hensen C, Hermann JC, Nam K, Ma S, Gao J, Hoeltje HD. A combined QM/MM
approach to protein-ligand interactions: Polarization effects of the HIV-1 protease on
selected high affinity inhibitors. J. Med. Chem. 47, 6673–6680, 2004.
[142] Hermann JC, Hensen C, Ridder L, Mulholland AJ, Hoeltje HD. Mechanisms of antibi-
otic resistance: QM/MM modeling of the acylation reaction of a class A β-lactamase
with benzylpenicillin. J. Am. Chem. Soc. 127, 4454–4465, 2005.
[143] Hermann JC, Ridder L, Hoeltje HD, Mulholland AJ. Molecular mechanisms of an-
tibiotic resistance: QM/MM modelling of deacylation in a class A β-lactamase. Org.
Biomol. Chem. 4, 206–210, 2006.
[144] Morse PM. Diatomic molecules according to the wave mechanics. II. Vibrational
levels. Phys. Rev. 34, 57–64, 1929.
[145] Guvench O, MacKerell Jr AD. Comparison of protein force fields for molecular dy-
namics simulations. In Molecular Modelling of Proteins, vol. 443 of Methods in Molecular
Biology, A Kukol, ed., pp. 63–88. Humana Press, 999 Riverview Drive, Suite 208, To-
towa, NJ 07512 USA, 2008.
[146] van der Spoel D, Lindahl E, Hess B, Groenhof G, Mark AE, Berendsen HJ. Gromacs:
Fast, flexible, and free. J. Comput. Chem. 26, 1701–1718, 2005.
92
Bibliography
[147] Rost B, Yachdav G, Liu J. The PredictProtein server. Nucleic Acids Res. 32, W321–326,
2004.
[148] Rost B. Review: Protein secondary structure prediction continues to rise. J. Struct.
Biol. 134, 204–218, 2001.
[149] Needleman SB, Wunsch CD. A general method applicable to the search for similari-
ties in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453, 1970.
[150] Smith TF, Waterman MS. Identification of common molecular subsequences. J. Mol.
Biol. 147, 195–197, 1981.
[151] Shi J, Blundell TL, Mizuguchi K. FUGUE: Sequence-structure homology recognition
using environment-specific substitution tables and structure-dependent gap penal-
ties. J. Mol. Biol. 310, 243–257, 2001.
[152] Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search
tool. J. Mol. Biol. 215, 403–410, 1990.
[153] Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD.
Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res.
31, 3497–3500, 2003.
[154] Poirot O, O’Toole E, Notredame C. Tcoffee@igs: A web server for computing, evalu-
ating and combining multiple sequence alignments. Nucleic Acids Res. 31, 3503–3506,
2003.
[155] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN,
Bourne PE. The Protein Data Bank. Nucleic Acids Res. 28, 235–242, 2000.
[156] Insight II version 2005, Accelrys Inc., 10188 Telesis Court, Suite 100 San Diego, CA
92121, USA.
[157] Canutescu AA, Shelenkov AA, Dunbrack RLJ. A graph-theory algorithm for rapid
protein side-chain prediction. Protein Sci. 12, 2001–2014, 2003.
[158] Holtje HD, Folkers G. Molecular modeling: Basic principles and applications, vol. 5 of
Methods and Principles in Medicinal Chemistry. VCH Verlagsgesellschaft mbH, Wein-
heim, Germany, 1997.
[159] Lindahl E. Molecular dynamics simulations. In Molecular modeling of proteins, vol. 443
of Methods in Molecular Biology, A Kukol, ed., pp. 3–23. Humana Press, 999 Riverview
Drive, Suite 208, Totowa, NJ 07512 USA, 2008.
[160] Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG. A smooth
particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593, 1995.
93
Bibliography
[161] Hess B, Bekker H, Berendsen HJC, Fraaije JGEM. LINCS: A linear constraint solver
for molecular simulations. J. Comput. Chem. 18, 1463–1472, 1997.
[162] Berendsen HJC, Postma JPM, DiNola A, Haak JR. Molecular dynamics with coupling
to an external bath. J. Chem. Phys. 81, 3684–3690, 1984.
[163] van der Spoel D, Lindahl E, Hess B, van Buuren AR, Apol E, Meulenhoff PJ, Tieleman
DP, Sijbers AL, Feenstra KA, van Drunen R, Berendsen HJ. GROMACS user manual
version 3.3., 2005. http://www.gromacs.org.
[164] Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: A program to
check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291,
1993.
[165] Morris GM, Lim-Wilby M. Molecular docking. In Molecular Modelling of Proteins, vol.
443 of Methods in Molecular Biology, A Kukol, ed., pp. 365–382. Humana Press, 999
Riverview Drive, Suite 208, Totowa, NJ 07512 USA, 2008.
[166] Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD. Improved protein-
ligand docking using GOLD. Proteins 52, 609–623, 2003.
[167] Merlitz H, Burghardt B, Wenzel W. Application of the stochastic tunneling method
to high throughput database screening. Chem. Phys. Lett. 370, 68–73, 2003.
[168] Discovery Studio 2.1, Accelrys Inc., 10188 Telesis Court, Suite 100 San Diego, CA
92121, USA.
[169] McGann MR, Almond HR, Nicholls A, Grant JA, Brown FK. Gaussian docking func-
tions. Biopolymers 68, 76–90, 2003.
[170] Jain A. Surflex: Fully automatic flexible molecular docking using a molecular
similarity-based search engine. J. Med. Chem. 46, 499–511, 2003.
[171] Rarey M, Kramer B, Lengauer T, Klebe G. A fast flexible docking method using an
incremental construction algorithm. J. Mol. Biol. 261, 470–489, 1996.
[172] Taylor R, Jewsbury P, Essex J. A review of protein-small molecule docking methods.
J. Comput.-Aided Mol. Des. 16, 151–166, 2002.
[173] Eldridge MD, Murray CW, Auton TR, Paolini GV, Mee RP. Empirical scoring func-
tions: I. The development of a fast empirical scoring function to estimate the binding
affinity of ligands in receptor complexes. J. Comput.-Aided Mol. Des. 11, 425–445, 1997.
[174] Jain AN. Scoring noncovalent protein-ligand interactions: A continuous differen-
tiable function tuned to compute binding affinities. J. Comput.-Aided Mol. Des. 10,
427–440, 1996.
94
Bibliography
[175] Muegge I, Martin Y. A general and fast scoring function for protein-ligand interac-
tions: A simplified potential approach. J. Med. Chem. 42, 791–804, 1999.
[176] Goodford PJ. A computational procedure for determining energetically favorable
binding sites on biologically important macromolecules. J. Med. Chem. 28, 849–857,
1985.
[177] Cramer RD, Patterson DE, Bunce JD. Comparative molecular field analysis (CoMFA).
1. Effect of shape on binding of steroids to carrier proteins. J. Am. Chem. Soc. 110,
5959–5967, 1988.
[178] GRID Manual, 2004. Molecular Discovery, Via Stoppani 38, 06135 Ponte San Gio-
vanni, Perugia, Italy.
[179] Klebe G, Abraham U, Mietzner T. Molecular similarity indices in a comparative
analysis (CoMSIA) of drug molecules to correlate and predict their biological activity.
J. Med. Chem. 37, 4130–4146, 1994.
[180] Rannar S, Lindgren F, Geladi P, Wold S. A PLS kernel algorithm for data sets with
many variables and fewer objects. Part 1: Theory and algorithm. J. Chemom. 8, 111–
125, 1994.
[181] Wester M, Johnson E, Marques-Soares C, Dijols S, Dansette P, Mansuy D, Stout C.
Structure of mammalian cytochrome P450 2C5 complexed with diclofenac at 2.1 a
resolution: Evidence for an induced fit model of substrate binding. Biochemistry 42,
9335–9345, 2003.
[182] Bairoch A, Boeckmann B, Ferro S, Gasteiger E. Swiss-Prot: Juggling between evolu-
tion and stability. Brief. Bioinform. 5, 39–55, 2004.
[183] McGuffin LJ, Bryson K, Jones DT. The PSIPRED protein structure prediction server.
Bioinformatics 16, 404–405, 2000.
[184] Kirton SB, Murray CW, Verdonk ML, Taylor RD. Prediction of binding modes for
ligands in the cytochromes P450 and other heme-containing proteins. Proteins 58,
836–844, 2005.
[185] Chen S, Kao YC, Laughton CA. Binding characteristics of aromatase inhibitors and
phytoestrogens to human aromatase. J. Steroid Biochem. Mol. Biol. 61, 107–115, 1997.
[186] Vihko P, Isomaa V. Method for prognosticating the progress of breast cancer and
compounds useful for prevention or treatment thereof, U.S. Patent 2006057628 (A1),
2006.
95
Bibliography
[187] Lilienkampf A, Karkola S, Alho-Richmond S, Koskimies P, Johansson N, Huhtinen
K, Vihko K, Wahala K. Synthesis and biological activity of 17β-hydroxysteroid de-
hydrogenase type 1 inhibitors based on a thieno[2,3-d]-pyrimidin-4(3H)-one core. J.
Med. Chem. 2009, submitted.
[188] Couture JF, Pereira De Jesus-Tran K, Roy AM, Cantin L, Cote PL, Legrand P, Luu-
The V, Labrie F, Breton R. Comparison of crystal structures of human type 3 3α-
hydroxysteroid dehydrogenase reveals an ”induced-fit” mechanism and a conserved
basic motif involved in the binding of androgen. Protein Sci. 14, 1485–1497, 2005.
[189] Mazumdar M, Zhou M, Zhu DW, Azzi A, Lin SX. Crystallogenesis of steroid-
converting enzymes and their complexes: Enzyme–ligand interaction studies and
inhibitor design facilitated by complex structures. Cryst. Growth Des. 7, 2206–2212,
2007.
[190] Hara A, Nakayama T, Nakagawa M, Inoue Y, Tanabe H, Sawada H. Kinetic and
stereochemical studies on reaction mechanism of mouse liver 17β-hydroxysteroid
dehydrogenases. J. Biochem. 102, 1585–1592, 1987.
[191] The Maybridge HitFinder Collection database, http://www.maybridge.com.
96
Structures, names and codes of the essential amino acids
HONH2
COOH
Serine, Ser, S
COOHH2N
NH2
Asparagine, Asn, N
COOH
NH
Proline, Pro, P
COOH
NH2
Isoleucine, Ile, I
COOH
NH2
Lysine, Lys, K
ONH2
COOH
Alanine, Ala, A
NH2
COOH
Methionine, Met, M
COOHHO
NH2
O
Glutamic acid, Glu, E
COOHHS
NH2
Cysteine, Cys, C
COOH
NH2
Valine, Val, V
COOH
NH2
Phenylalanine, Phe, F
COOH
HN
NH2
Arginine, Arg, R
H2N
NHCOOHHO
NH2
Aspartic acid, Asp, D
O
COOHH2N
NH2
O
Glutamine, Gln, Q
COOHH
NH2
Glycine, Gly, G
NH2
COOH
Histidine, His, H
HN
N COOH
NH2
Leucine, Leu, L
OH
NH2
COOH
Threonine, Thr, T
NH2
COOH
Tryptophan, Trp, W
HN
COOH
NH2
Tyrosine, Tyr, Y
HO
S
H2N