Introduction to Drug Target Identification

Chapter - I Introduction

_________________________________________________________________________

Identification and Validation of Drug Targets

Chapter I

INTRODUCTION

1.1 NEW DRUGS - WHY?

In the initial stages of drug therapy, scientists and medical researchers were not aware of the targets on which antibiotics act. What fascinated them was simply that the newly discovered compounds exhibited reasonable antibacterial properties. These findings prompted efforts to isolate the compounds and use them to treat bacterial diseases. Alexander Fleming’s discovery of the antibiotic ‘Penicillin’ is considered one of the historical milestones in medical research. The following are some of his words summarizing the findings (BMJ, 1955).

A certain type of penicillium produces in culture a powerful

antibacterial substance.

The active agent is readily filterable and the name 'penicillin' has

been given to filtrates of broth cultures of the mould.

The action is very marked on the pyogenic cocci and the diphtheria

group of bacilli.

Penicillin is non-toxic to animals in enormous doses and is not

irritant. It does not interfere with leucocytic function to a greater

degree than does ordinary broth.

"It is suggested that it may be an efficient antiseptic for application

to, or injection into, areas infected with penicillin-sensitive

microbes."

The discovery of penicillin in 1928 gave medical researchers confidence that any bacterial disease could be treated. Penicillin was one of the hallmark discoveries in the field of antibiotics, and it was effective against most of the bacterial diseases of that time. Soon, however, its effect faded owing to the inherent capability of microbes to develop resistance (Watson, 1958). Resistance is easily transmitted among bacterial species, and hence new molecules and antibiotics have always been needed to combat life-threatening diseases.

In the 20th century penicillin was one of the most widely used antibiotics. Today it is uncommon to find a person who has not received it during their lifetime. Almost every organism responded well to this drug. Subsequent studies carried out in the 1940s explained its mode of action on the cell wall. At that stage scientists and medical researchers did not have a molecular-level understanding of the exact binding of this molecule, whereas modern methods of drug discovery explain how a drug molecule binds specifically to, and interacts with, the disease target.

The growing concern over antibiotic resistance and drug efficacy demands the discovery and development of new drugs to fight life-threatening diseases. Recent technological advances have enabled rapid sequencing of the genomes of various organisms. The completion

of the Human Genome Project (HGP) brought about a paradigm shift in the drug discovery process, as it provided a molecular-level understanding of disease. The sequencing of the genomes of humans and of various human pathogens has enabled researchers to look for novel drug targets in these sequences. About 500 drug targets have been identified to date, while the drugs currently in use are based on only 120 drug targets (Hopkins and Groom, 2002). The majority of existing antibiotics utilize a limited number of core chemical structures and target only a few cellular functions, such as cell wall biosynthesis, DNA replication, transcription, and translation (Moir et al., 1999).

1.2 ANTIBACTERIAL DRUG DISCOVERY - A BRIEF HISTORY

The importance of new classes of antibiotics becomes clear when we examine the origin of antibacterial drug discovery and its present status. The pharmaceutical industry owes much of its early prosperity to the discovery of antibacterial agents. The earliest antibacterial agents discovered were the sulfonamides, penicillin and streptomycin, and these were rapidly followed by tetracyclines, isoniazid, macrolides, glycopeptides, cephalosporins, nalidixic acid and other molecular classes. Although penicillin was discovered in 1928, it took a consortium of five pharmaceutical companies (Abbott, Lederle, Merck, Chas. Pfizer and ER Squibb & Sons) and the US Department of Agriculture to develop and produce it in the 1940s, mainly as part of the war effort during the Second World War. The cephalosporins became popular during the 1970s, with several

‘second’ and ‘third’ generation products entering the marketplace by the mid-1980s.

Coincident with the growing market dominance of the third-generation cephalosporins was the emergence of the pandemic of multidrug-resistant Staphylococcus aureus infections in US hospitals and Streptococcus pneumoniae in the community. At that time, in the early 1980s, the pharmaceutical industry began scaling back its antibacterial drug discovery efforts, with approximately half of the large US and Japanese pharmaceutical companies ending or curtailing their efforts. Yet antibacterial drug discovery did continue at many major European and US pharmaceutical companies through the 1990s. Since 1999, however, the industry has once again pulled back from anti-infective research in an even more concerted manner, with 10 of the 15 largest companies ending or curtailing their discovery efforts. At the same time, the industry has been experiencing a series of mega-mergers leading to large-scale consolidation. This consolidation alone has resulted in a major decrease in the hunt for novel antibacterial agents.

The rise in antibacterial drug resistance among human pathogens is a widespread phenomenon. Resistant bacteria are defined as those that are not inhibited by the usually achievable systemic concentration of an agent under a normal dosage schedule and/or whose minimum inhibitory concentrations fall in the ranges associated with resistance. Drug resistance is of major concern for severely ill and hospitalized patients, as the therapeutic efficacy of current drugs in practice

is declining. The first clear evidence of resistance to penicillin came from an accidental observation reported in 1958 (Ley et al., 1958). The ability of microorganisms to develop resistance towards an antibacterial substance is inherent. The widespread occurrence of microbial resistance, coupled with the declining efficacy of current antibiotics in practice, demands the discovery and development of novel therapeutics.
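The definition of resistance given in this paragraph can be made concrete with a minimal sketch. It classifies an isolate by comparing its measured minimum inhibitory concentration (MIC) with two breakpoints. The function name, the breakpoint values and the example isolates are illustrative assumptions; they are not taken from any clinical standard or from the sources cited here.

```python
def classify_isolate(mic_mg_per_l, susceptible_max, resistant_min):
    """Classify an isolate from its MIC (mg/L) using two hypothetical breakpoints."""
    if mic_mg_per_l <= susceptible_max:
        return "susceptible"
    if mic_mg_per_l >= resistant_min:
        return "resistant"
    return "intermediate"


# Hypothetical breakpoints for an unnamed drug (illustrative values only).
SUSCEPTIBLE_MAX = 2.0
RESISTANT_MIN = 8.0

# Hypothetical MIC measurements in mg/L.
isolates = {"isolate_A": 0.5, "isolate_B": 4.0, "isolate_C": 32.0}

results = {
    name: classify_isolate(mic, SUSCEPTIBLE_MAX, RESISTANT_MIN)
    for name, mic in isolates.items()
}
for name, label in results.items():
    print(name, label)
```

In practice the breakpoints depend on the drug, the organism and the dosing regimen, which is why the sketch keeps them as parameters rather than constants inside the function.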

The Antimicrobial Availability Task Force identified six problematic pathogens as potential threats to the community: three Gram-negative organisms (Acinetobacter baumannii, extended-spectrum β-lactamase (ESBL)-producing Enterobacteriaceae, and Pseudomonas aeruginosa), two Gram-positive pathogens (methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococcus faecium) and the filamentous fungi Aspergillus spp. Of these, MRSA has received the most attention, largely driven by clinical need rather than by large sums of money. It is likely that interest in the other problematic pathogens will also be driven by clinical need and not by investment to increase awareness. Some experts consider two additional water-borne, non-fermenting Gram-negative pathogens, namely Stenotrophomonas maltophilia and Burkholderia cepacia, both of which are related to P. aeruginosa, to be problematic organisms.

Multidrug-resistant strains are particularly problematic, conveying

increased mortality, longer hospital stays, and higher hospital costs over

and above the values associated with susceptible strains of these

pathogens. Successful treatment requires a ‘hit hard and hit fast’ approach

with an antibiotic that provides coverage of these important Gram-negative

organisms, including multidrug-resistant strains. Various studies have

indicated that the frequency of multidrug-resistant isolates is increasing

worldwide. Given the present need for the discovery and development of

novel antibiotics, we are already too late.

1.3 MULTIDRUG RESISTANCE - DRIVING THE NEED FOR NEW DRUGS

Increased resistance to commonly used antibiotics, a growing prevalence of infections, and the emergence of new pathogenic organisms challenge the current use of antibiotic therapy (Rosamond and Allsop, 2000). Recent epidemiological studies suggest an increase in healthcare-associated infections caused by gram-negative bacteria, particularly Klebsiella spp., Escherichia coli, Pseudomonas aeruginosa, and Acinetobacter spp. The rising incidence of drug resistance in these pathogens presents a challenge, given how few novel antimicrobial agents under development specifically target these organisms. Developments in the areas of targets involved in bacterial virulence or resistance against antibacterial agents have been reviewed previously (Schmid, 1998). Bacteria have developed a variety of resistance mechanisms, coupled with the ability to mobilize the respective genetic information between bacterial strains and species (Heinemann, 1999).

Gram-negative non-fermenters exhibit resistance to essentially all commonly used antibiotics, including anti-pseudomonal penicillins and cephalosporins, aminoglycosides, tetracyclines, fluoroquinolones, trimethoprim-sulfamethoxazole, and carbapenems. Polymyxins are the remaining antibiotic drug class with fairly consistent activity against multidrug-resistant strains of P. aeruginosa, Acinetobacter spp., and S. maltophilia. A variety of resistance mechanisms have been identified in P. aeruginosa and other gram-negative non-fermenters, including enzyme production, overexpression of efflux pumps, porin deficiencies, and target-site alterations. Multiple resistance genes frequently coexist in the same organism. Multidrug resistance in gram-negative non-fermenters makes treatment of infections caused by these pathogens both difficult and expensive. Improved antibiotic stewardship and infection-control measures will be needed to prevent or slow down the emergence and spread of multidrug-resistant, non-fermenting gram-negative bacilli in the healthcare setting (Lautenbach and Polk, 2007; McGowan, 2006).

Knowledge of the clinical and economic impact of antimicrobial

resistance is useful to influence programs and behavior in healthcare

facilities, to guide policy makers and funding agencies, to define the

prognosis of individual patients and to stimulate interest in developing new

antimicrobial agents and therapies. A recent study showed that there is an

association between antimicrobial resistance in Staphylococcus aureus,

Enterococci and Gram-negative bacilli and increases in mortality, morbidity,

length of hospitalization and cost of healthcare. Patients with infections due to antimicrobial-resistant organisms have higher costs (US $6,000-30,000) than do patients with infections due to antimicrobial-susceptible organisms; the difference in cost is even greater when patients infected with antimicrobial-resistant organisms are compared with patients without infection (Maragakis et al., 2008). Delivering healthcare at an affordable cost is the need of the hour, as healthcare costs are already rising owing to various factors.

1.3.1 Molecular mechanism of drug resistance

Development of resistance limits the usefulness of effective drugs and hence poses a major threat to the pharmaceutical industry. Over the past two decades, understanding the mechanisms of drug resistance has become a central issue as its importance in medicine has assumed ever-increasing significance. Table 1 shows the various origins of antimicrobial resistance. Understanding the origin of resistance will help in avoiding potential pitfalls while developing a new drug for a specific disease.

Table 1

Origins of Intrinsic and Acquired Resistance

S. No. | Type | Duration of resistance | Frequency of resistance within the population
Intrinsic resistance
1. | Absence of target site | Permanent | All cells
2. | Species-specific structure of target site | Permanent | All cells
3. | High detoxication capacity, arising from: | |
   | a. tissue-specific function | Permanent | All cells
   | b. ontogenic variations | Variable | All cells
   | c. sex-specific differences | Permanent | All cells
   | d. population polymorphisms | Permanent | Variable
   | e. self defence | Permanent | All cells
   | f. high repair capacity | Permanent | All cells
4. | Low drug delivery | Variable | Variable
5. | Cell cycle effects | Variable | Variable
6. | Adaptive change | Temporary | All cells
7. | Stress response | Temporary | All cells
Acquired resistance
1. | Natural selection | Permanent | Rare
2. | Constitutive adaptive change | Permanent | Rare
3. | Constitutive stress response | Permanent | Rare
4. | Gene transfer | Required continued selection | Rare
5. | Gene amplification | Required continued selection | Rare

Source: John Hayes and Roland Wolf (1990)

Intrinsic drug resistance

The term ‘Intrinsic resistance’ is used to describe the situation where

an organism, or cell, possesses a characteristic 'feature' which allows all

normal members of the species to tolerate a particular drug or chemical

environment. In this case, the 'feature' responsible for resistance is an

inherent, or integral, property of the species that has arisen through the

processes of evolution.

Mechanisms of intrinsic resistance

The phenomenon of intrinsic resistance can be due to either the

presence or the absence of a biochemical 'feature' (Table 2). This may, for

example, be the structure of the cell envelope or membrane, the existence

of a drug transport protein, the absence of a metabolic pathway, the

presence of a drug-metabolizing enzyme, the structure of the drug target

site and the expression of specific stress response proteins or high repair

capacity.

Self protection mechanism associated with intrinsic drug resistance

Many organisms survive in the environment through their ability to

produce chemicals which are toxic or distasteful to their predators or their

competitors. As a consequence, they require their own defence against the

noxious chemicals they produce. Studies on the antibiotic-producing

microorganisms such as the various species of Streptomyces provide good

examples of this form of intrinsic drug resistance. The mechanisms used by

organisms to protect themselves against their own antibiotic products have been divided into two types: first, resistance involving inactivation of antibiotics such as streptomycin and neomycin by phosphotransferases and acetyltransferases; and second, resistance resulting from modification of potential target sites within the organism (Cundliffe, 1984). For example, the ribosomal RNA is protected by methylation in the erythromycin producer Streptomyces erythraeus.

Chemically-induced adaptive change and intrinsic resistance

Drugs and a wide variety of toxic agents (e.g. radiation, osmotic

shock and heat shock) provoke many biochemical changes in cells that

allow them to overcome the toxic effects of either the same or other

compounds. In some circumstances this ability to resist chemical insult

arises immediately following administration of the drug or, alternatively,

there may be a significant time lag following exposure to the drug before

the adaptive process is manifest.

Physiological stress response and intrinsic resistance

Environmental factors, other than drugs, can, through the ability to

stress cells, elicit an adaptive response that confers resistance against

chemicals. Phenomena such as heat, anoxia, viral infection, trauma, UV

irradiation, pH, osmotic shock and oxidative stress stimulate a genetic

reflex in all cells that is 'designed' to confer tolerance against subsequent

exposure to the same physiological insult. Prokaryotes have at least four

major regulons which are induced by stress, namely, the SOS response (Walker, 1985), the adaptive response to alkylating agents (Samson and Cairns, 1977; Demple et al., 1985), the oxyR network (Christman et al., 1985; Storz et al., 1990) and the heat-shock response (Lindquist, 1986; Carper et al., 1987).

Acquired drug resistance

The term ‘acquired resistance’ is used to describe the case where a

resistant strain, or cell line, emerges from a population that was previously

drug-sensitive. Three major types of genetic change can be envisaged:

1. mutations and amplifications of specific genes directly involved in a

protective pathway,

2. mutations in genes which regulate stress-response processes and

lead to the altered expression of large numbers of proteins, and

3. gene transfer.

These types of change are of course not mutually exclusive, and

examination of the multiple changes that are frequently seen in resistant

tumour cell lines suggests that several mechanisms can operate

simultaneously.

Natural selection and acquired resistance

The distinction between acquired resistance through natural selection

and intrinsic drug resistance lies in the frequency with which the mutated

gene is observed in the 'wild type' population.

Drug-mediated genetic changes and acquired resistance

Herbicides, insecticides and antimicrobials are generally not mutagenic. However, many drugs used in cancer chemotherapy are mutagens and, in addition to providing the selection pressure for resistance, can significantly increase the frequency of mutations that produce resistant cells. This is probably greatly potentiated by the inherent genetic instability of cancer cells. Such effects are exemplified by the significant increase in the frequency of DNA amplification following the exposure of tumour cells to mutagens such as monofunctional and bifunctional alkylating agents and UV irradiation (Connors, 1984; Stark, 1986). It is technically difficult to demonstrate whether resistant cells in tumours arise from drug-mediated mutations or were present before chemotherapy was initiated.

Table 2

Examples of acquired drug resistance

Example | Organism | Resistance to | Procedure | Type of resistance
Bacterial drug resistance | Escherichia coli | Chloramphenicol, ampicillin | Exposure to drug | Gene transfer (+ natural selection)
Bacterial drug resistance | Serratia marcescens | Fosfomycin | Exposure to drug | Gene transfer (+ natural selection)
Preneoplastic hepatocyte nodules | Rat | Toxins, carcinogens | Carcinogen exposure | Carcinogen-induced stress response
Persistent hepatocyte nodules | Rat | Toxins, carcinogens | Carcinogen exposure | Natural selection: altered expression of drug metabolizing enzymes
oxyR1 network (adaptive response to oxidative stress) | Salmonella typhimurium | Peroxides, ethanol | In vitro selection of cell line | Constitutive overexpression of a stress response
ampC, R and D genes (adaptive response to cephalosporins) | Citrobacter freundii | Cefuroxime, cefotaxime, ceftazidime | In vitro selection of cell line | Constitutive overexpression of an adaptive response
Ada gene (adaptive response to alkylating agents) | Escherichia coli | N-Methyl-N-nitrosourea, N-methyl-N'-nitro-N-nitrosoguanidine | In vitro selection of cell line | Constitutive overexpression of an adaptive response
Multidrug resistance | Tumour cell lines | Adriamycin, vincristine, actinomycin D | Stepwise exposure to increasing concentrations of cytotoxic drug | Amplification of P-glycoprotein genes
Alkylating agent resistance | Tumour cell lines | Alkylating agents | Stepwise exposure to increasing concentrations of cytotoxic drug | Overexpression of drug metabolizing enzymes
DNA gyrase mutants | Escherichia coli | Nalidixic acid | In vitro exposure to drug | Natural selection
Penicillin binding protein mutants | Escherichia coli | Penicillin | Exposure to drug | Natural selection
Acetylcholinesterase mutants | House flies | Organophosphorus | Exposure to drug | Natural selection

Source: John Hayes and Roland Wolf (1990)

1.4 CONCERNS FOR DRUG DISCOVERY AND DEVELOPMENT

The process of drug development begins with target identification and eventually leads to the development of the final medication. Drug discovery and development is an expensive, laborious and incremental process. The main objective of this effort is to identify a molecule with the desired effect to cure a specific disease. The molecule must also demonstrate quality, safety and efficacy in treating patients without undesirable side effects (Snodin, 2002).

Currently, bringing a new molecule to market costs around US $800 million, and it takes nearly 12 years for a drug to progress from bench to market (EMBO Reports, 2004). The drug discovery process has numerous technical bottlenecks, and a molecule under research carries a high risk of failure at any stage of development. In spite of the growth in drug discovery technologies, the number of drugs that have received FDA approval is very low. For example, of 5000 compounds that enter pre-clinical testing, approximately five are tested in human trials, of which only one receives approval for therapeutic use. Furthermore, no new chemical classes of active antibiotics have been successfully introduced into the clinic for over 30 years. Since development costs have increased, the number of companies investing in R&D has decreased drastically. However, effective use of the new genomic technologies and

available data resources can accelerate the process of drug discovery and help avoid potential pitfalls in the drug discovery pipeline.
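The attrition figures quoted in this section can be turned into stage-by-stage success rates with a few lines of arithmetic. The sketch below uses only the numbers given in the text (5000 compounds entering pre-clinical testing, about five reaching human trials, one approved); the variable and function names are illustrative.

```python
from fractions import Fraction

# Figures quoted in the text.
entering_preclinical = 5000
entering_human_trials = 5
approved = 1


def stage_success_rate(entered, advanced):
    """Fraction of compounds entering a stage that advance past it."""
    return Fraction(advanced, entered)


preclinical_to_trials = stage_success_rate(entering_preclinical, entering_human_trials)
trials_to_approval = stage_success_rate(entering_human_trials, approved)
overall = preclinical_to_trials * trials_to_approval

print(f"Pre-clinical -> human trials: {float(preclinical_to_trials):.2%}")  # 0.10%
print(f"Human trials -> approval:     {float(trials_to_approval):.0%}")     # 20%
print(f"Overall:                      {float(overall):.2%}")                # 0.02%
```

Read this way, most of the attrition occurs before a compound ever reaches human testing, which is where earlier and cheaper filtering has the most room to help.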

1.4.1 Stages of drug discovery

The cost and time taken to design, develop and release new drugs to the market have continued to rise in recent times (Grabowski et al., 1990; Di Masi, 2002), and the number of new drug approvals has declined drastically (Frantz and Smith, 2003). The pharmaceutical industry is keen to reduce drug candidate attrition throughout the drug discovery and development process. Numerous drugs with reasonable biological activity fail in clinical studies. Earlier testing, especially through wet-laboratory or in silico protocols, can help avoid such pitfalls in drug development.

Fig. 1: Modern Day Drug Discovery Pipeline

The first step is to develop an assay for the receptor or target. An assay is a test that assesses whether a molecule (drug) binds to the target receptor. Usually a pharmaceutical company will first screen its entire corporate database of known compounds, because compounds in the database are usually very well characterized. Synthetic methods will also be known for these compounds, and patent protection is often in place. This enables the company to rapidly prototype a candidate ligand whose chemistry is well known and within the intellectual property of the company. If none of the compounds in the database binds the target, the company may look elsewhere for a compound that fits the receptor. A molecule that successfully binds the target is termed a lead compound. The next step is to study the receptor's interactions with the ligand molecule. This involves both in silico and in vitro analysis to find the binding residues involved in the ligand-receptor association. The 3D structure of the ligand-receptor complex provides a clear perspective on the ligand-receptor interaction.
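The screening logic described above can be sketched in a few lines. This is a simplified illustration, not a description of any company's actual workflow: the `Compound` class, the normalized `assay_signal` score, the 0.7 hit threshold and all compound names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    assay_signal: float  # hypothetical normalized binding signal, 0 to 1
    in_house: bool       # True if the compound is in the corporate database


def screen(compounds, threshold):
    """Return compounds whose assay signal meets the threshold, strongest first."""
    hits = [c for c in compounds if c.assay_signal >= threshold]
    return sorted(hits, key=lambda c: c.assay_signal, reverse=True)


def find_lead(corporate_db, external_library, threshold=0.7):
    """Screen the in-house database first; use external compounds only if it yields no hit."""
    for library in (corporate_db, external_library):
        hits = screen(library, threshold)
        if hits:
            return hits[0]
    return None


# Hypothetical assay results.
corporate_db = [Compound("CPD-001", 0.12, True), Compound("CPD-002", 0.45, True)]
external_library = [Compound("EXT-101", 0.81, False), Compound("EXT-102", 0.74, False)]

lead = find_lead(corporate_db, external_library)
print(lead.name if lead else "no lead found")
```

Searching the in-house database first mirrors the rationale given in the text: those compounds are already characterized, have known synthetic routes and usually fall within the company's intellectual property.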

1.5 DETERMINATION OF THE CRYSTAL STRUCTURE

If the receptor is water soluble, there is a chance that X-ray

crystallographic analysis can be employed to determine the three-

dimensional structure of the ligand bound to the receptor at the atomic

level. X-ray crystallography is a very powerful tool for it allows scientists to

directly visualize a snapshot of the individual atoms of the ligand as they

reside within the receptor. This snapshot is referred to as a crystal

structure of the ligand-receptor complex. Unfortunately, not all complexes

can be analyzed in this manner. However, if a crystal structure can be

determined, a strategy can then be developed based upon this

characterization to improve and optimize the binding of the lead

compound. From this point onward, a cycle of iterative chemical refinement

and testing continues until a drug is developed that undergoes clinical

trials. The techniques used to refine drugs are combinatorial chemistry and

structure-based drug design.

1.5.1 X-ray crystallography and drug discovery

The concept of applying X-ray crystallography in drug discovery emerged more than 30 years ago, as the first 3D structures of proteins were determined. Typical examples include the synthesis of ligands of haemoglobin to decrease sickling (Beddell et al., 1976; Goodford et al., 1980), the chemical modification of insulin to increase its half-life (Blundell, 1972), and the design of serine protease inhibitors to control blood clotting. In spite of the promising results, most pharmaceutical companies considered X-ray crystallography too expensive and time consuming to bring ‘in house’, and for a time most activity remained in academia. Within a decade, a radical change in drug design had begun, incorporating knowledge of the three-dimensional structures of target proteins into the design process. Although structures of the relevant drug targets were usually not available directly from X-ray crystallography, comparative models based on homologues proved useful in defining topographies of the

complementary surfaces of ligands and their protein targets, and began to

be exploited in lead optimization in the 1980s (Blundell et al., 1983;

Blundell, 1996; Campbell, 2000).

Soon, crystal structures of key drug targets became available: AIDS drugs such as Agenerase and Viracept were developed using the crystal structure of HIV protease (Lapatto et al., 1989), and the influenza drug Relenza was designed using the crystal structure of neuraminidase (Varghese, 1999). More than 40 drugs originating from structure-based design approaches have now entered clinical trials (Hardy and Malikayil, 2003), and seven of these had achieved regulatory approval and been marketed as drugs by mid-2003. These successes have led the pharmaceutical sector to explore the design and development of drugs using in silico approaches.

Protein structure can influence drug discovery at every stage in the

design process. Classically it has been exploited in lead optimization, a

process that uses structure to guide the chemical modification of a lead

molecule to give an optimized fit in terms of shape, hydrogen bonds and

other non-covalent interactions with the target. Protein structure can also

be used in target identification and selection (the assessment of the

‘druggability’ or tractability of a target). Traditionally, this has involved

homology recognition assisted by knowledge of protein structure; but now

structural genomics programs are seeking to define representative

structures of all protein families, allowing proposals of binding regions and

molecular functions. More recently, X-ray crystallography has been used to

assist the identification of hits by virtual screening and more directly in the

screening of chemical fragments. The key roles of structural biology and

bioinformatics in lead optimization remain as important as ever (Whittle and

Blundell, 1994; Lombardino and Lowe, 2004). For proteins that cannot be

crystallized, the structure cannot be elucidated by X-ray

crystallography. Such structures can often be predicted with a high level of

accuracy using protein modeling methods. Protein modeling is a widely

accepted technique because it can produce reliable 3D structures, and it is

now of high importance in the drug discovery industry.

1.5.2 Protein Modeling

The process of evolution has resulted in the production of DNA

sequences that encode proteins with specific functions. In the absence of a

protein structure that has been determined by X-ray crystallography or

nuclear magnetic resonance (NMR) spectroscopy, researchers can predict

the three-dimensional structure using protein modeling. This method uses

experimentally determined protein structures (templates) to predict the

structure of another protein that has a similar amino acid sequence (target).

Although protein modeling may not be as accurate at determining a

protein's structure as experimental methods, it is still extremely helpful in

proposing and testing various biological hypotheses. This technique also

provides a starting point for researchers wishing to confirm a structure

through X-ray crystallography and NMR spectroscopy. Because the

different genome projects are producing more sequences and because

novel protein folds and families are being determined, protein modeling will

become an increasingly important tool for scientists working to understand

normal and disease-related processes in living organisms.

1.5.2.1 The Four Steps of Protein Modeling (Lorenza, 2009)

1. Identify the proteins with known three-dimensional structures that are

related to the target sequence.

2. Align the related three-dimensional structures with the target sequence

and determine those structures that will be used as templates.

3. Construct a model for the target sequence based on its alignment with

the template structure(s).

4. Evaluate the model against a variety of criteria to determine if it is

satisfactory.
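
The template identification and selection steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target and candidate templates are taken to be already aligned (gaps written as "-"), and the function names, the toy sequences and the identity cut-off are hypothetical, not part of the cited procedure.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def select_templates(target, candidates, cutoff=30.0):
    """Return (name, identity) pairs at or above the cut-off, best first."""
    scored = [(name, percent_identity(target, seq)) for name, seq in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical pre-aligned sequences, for illustration only.
target = "MKT-AYIAKQR"
candidates = {
    "template_A": "MKTWAYIGKQR",
    "template_B": "MRS-PLLAAQW",
}
print(select_templates(target, candidates, cutoff=50.0))
```

In practice the alignment itself, the model building and the model evaluation would be done with dedicated modeling software; the sketch only shows how a sequence identity criterion can rank candidate templates.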

Fig. 2: Protein modeling steps

1.5.2.2 Comparative or homology protein structure modeling

Homology or comparative protein structure modeling constructs a

three-dimensional model of a given protein sequence based on its similarity

to one or more known structures. The first class of protein structure

prediction methods, including threading and comparative modeling, relies on

detectable similarity spanning most of the modeled sequence and at least

one known structure. The second class of methods, de novo or ab initio

methods, predicts the structure from sequence alone, without relying on

similarity at the fold level between the modeled sequence and any of the

known structures. Despite progress in ab initio protein structure prediction,

comparative modeling remains the most reliable method to predict the 3D

structure of a protein with an accuracy that can be comparable to a low-

resolution, experimentally determined structure.

1.6 PROTEIN MODELING AND DRUG DISCOVERY

Advances in bioinformatics and protein modeling algorithms, in

addition to the enormous increase in experimental protein structure

information, have aided in the generation of databases that comprise

homology models of a significant portion of known genomic protein

sequences. Currently, 3D structure information can be generated for up to

56% of all known proteins. However, there is considerable controversy

concerning the real value of homology models for drug design. Despite the

numerous uncertainties that are associated with homology modeling, recent

research has shown that this can be used to significant advantage in the

identification and validation of drug targets, as well as for the identification

and optimization of lead compounds.

Homology model-based drug design has been applied to epidermal

growth factor receptor tyrosine kinase protein (Ghosh et al., 2001), Bruton’s

tyrosine kinase (Mahajan et al., 1999), Janus kinase 3 (Sudbeck et al.,

1996) and human aurora 1 and 2 kinases (Vankayalapati et al., 2003).

Traditionally, the crucial impasse in the industry’s search for new drug

targets was the availability of biological data. Now, with the advent of the

human genome sequence, bioinformatics offers several approaches for the

prediction of structure and function of proteins on the basis of sequence

and structural similarities. The protein sequence → structure → function

relationship is well established: structural details at the

atomic level help to explain the molecular function of proteins. Impressive

technological advances in areas such as structural characterization of

biomacromolecules, computer sciences and molecular biology have made

rational drug design feasible and offer a holistic approach.

Protein modeling, being a computational approach, can generate the

3D structure of a receptor quickly and, given a suitable template, with good accuracy. It also makes it

possible to study the binding pockets of the receptor (protein) and their interactions with

ligands by molecular docking. Such structures are of high importance for

screening new chemical entities by in silico methods.

1.6.1 Multidomain Protein Targets

One of the great internal contradictions of drug discovery in practice

is that most regulatory proteins in man, the obvious targets for new drugs,

are complex proteins that are often multidomain and usually

components of multiprotein systems. A domain represents a complete

functional unit. A protein may have one or more domains. Most of the focus

in the pharmaceutical industry is on the active sites of monomeric proteins.

Many proteins in the higher eukaryotes are large and contain multiple

domains. A typical example is the DNA-dependent protein kinase (DNA-PK), a key

molecule in non-homologous end joining, which signals the assembly of the

multiprotein system involved in the repair of double strand breaks (Smider

et al., 1994; Taccioli et al., 1994). This protein is composed of a large

catalytic subunit and a regulatory heterodimer of Ku70 and Ku80. A program,

DOMINANT, has been written to deconvolute protein structures into their

constituent domains in order that domains and domain boundaries can be

classified (Brewerton, 2004). For an input protein structure, DOMINANT

checks the existing domain database using a structure comparison

procedure to identify any recurrent domains, and then uses a procedure to

identify domains from the spatial separation of secondary structures to

deconvolute the remaining structure. Programs like DOMINANT will be

helpful in identifying multidomain proteins and further assessing them for

druggability.

1.7 IN SILICO - ITS ORIGIN AND REVOLUTION

The term ‘in silico’ is a modern word usually used to mean

experimentation performed by computer and is related to the more

commonly known biological terms in vivo and in vitro. The history of the ‘in

silico’ term is poorly defined, with several researchers claiming a role in

its origin. However, some of the earliest published examples of the

word include the use by Sieburg (1990) and Danchin et al. (1991).

Informatics is a real aid to discovery when analyzing biological

functions. We could reiterate this for drug discovery, which is a hugely

complex information handling and interpretation exercise. With so much

information to process, we need to be able to discover the shortcuts or the

rules that will point us as quickly as possible to the targets and molecules

that are likely to proceed to the clinic and then on to the market. It has also been

suggested that if we are to build on the advances of the human genome,

we need to integrate computational and experimental data, with the aim of

initiating in silico pharmacology linking all data types. This could change

the way the pharmaceutical industry discovers drugs using data to enable

simulations; however, there may still be significant gaps in our knowledge

beyond genes and proteins (Whittaker, 2003). Structure-based methods are

broadly used for drug discovery, but these are just a beginning. For example,

in neuropharmacology it is expected that ligand-receptor interaction kinetic

models will need to be integrated with network approaches to fully understand

neurological disorders; in general, this could be applied more widely to

pharmacology (Aradi and Erdi, 2006). Basically, there are two outcomes

when bioactive compounds and biological systems interact (Testa and

Kramer, 2006). Note that ‘biological system’ is defined here very broadly

and includes functional proteins (for example, receptors), monocellular

organisms and cells isolated from multicellular organisms, isolated tissues

and organs, multicellular organisms and even populations of individuals, be

they unicellular or multicellular. As for the interactions between a drug and

a biological system, they may be simplified to ‘what the compound does to

the biosystem’ and ‘what the biosystem does to the compound.’ A drug that

acts on a biological system can elicit a pharmacological and/or toxic

response, in other words a pharmacodynamic (PD) event. With the

computational methods now available, decision making and the virtual simulation of every facet

of drug discovery and development have become a reality (Swaan and Ekins, 2005).

1.7.1 In silico drug discovery

The application of computational methods and techniques in the drug

discovery and development process is increasingly appreciated and is gaining

popularity among pharmaceutical companies. In silico methods

reduce the time and resource requirements of chemical synthesis and

biological testing. The uses of computation in drug

discovery include hit identification, lead identification and lead optimization.

Before the introduction of genomic sciences, the drug discovery process

was guided mostly by chemistry and pharmacology. With the

completion of the Human Genome Project, coupled with a molecular-level

understanding of diseases, biology has become the major driving force of the

discovery process.

1.7.1.1 Chemogenomics approach

The chemogenomics approach aims at studying the effect of a wide array of

small-molecule ligands on a wide array of macromolecular targets. The human

genome has approximately 3000 druggable targets, of which only 800

proteins are currently investigated by pharmaceutical companies. The

chemogenomic approach attempts to match these potential targets with the ligand

space. It depends on three components: a compound library, a

representative biological system and a reliable output (gene/protein

expression data). This approach assumes that compounds sharing

some chemical similarity also share targets, and that targets sharing similar

ligands should share similar patterns or binding sites.
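
The similarity assumption described above can be made concrete with a small Python sketch. It assumes that each compound is represented by a set of fingerprint feature indices and uses the Tanimoto coefficient as the similarity measure; the fingerprints, target names and threshold are hypothetical and only serve to illustrate the idea.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient between two sets of fingerprint features."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def candidate_targets(query_fp, annotated_ligands, threshold=0.5):
    """Collect the targets of known ligands that are similar enough to the query."""
    targets = set()
    for fp, ligand_targets in annotated_ligands:
        if tanimoto(query_fp, fp) >= threshold:
            targets |= set(ligand_targets)
    return targets


# Hypothetical fingerprints (sets of feature indices) with target annotations.
known = [
    ({1, 2, 3, 4}, ["kinase_X"]),
    ({7, 8, 9}, ["protease_Y"]),
]
query = {1, 2, 3, 5}
print(candidate_targets(query, known))
```

The threshold controls how much chemical similarity is required before a target annotation is transferred; a suitable value depends on the fingerprint and would have to be chosen empirically.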

1.7.2 Virtual Screening and In silico Drug Targets

Assessment of 617 approved oral drugs in two-dimensional (2D)

molecular property space (molecular weight versus cLogP) showed that

many of them had cLogP > 5 and MW > 500. In spite of this, their associated

targets were potentially druggable but had yet to realize their potential

(Paolini et al., 2006). A recent analysis using 48 molecular 2D descriptors

followed by principal component analysis of over 12,000 anticancer

molecules representing cancer medicinal chemistry space, showed that

they populated a different space broader than hit-like space and orally

available drug-like space. This would indicate that in order to find

molecules for anticancer targets in commercially available databases,

different rules are required other than those widely used for drug-likeness,

as they may unfortunately filter out possible clinical candidates (Lloyd et

al., 2006).
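
A property-space check of the kind discussed above can be sketched as follows. The thresholds (molecular weight 500 and cLogP 5) are the ones quoted in the text; the compound names and property values are hypothetical, and the function flags compounds instead of discarding them, since hard filters may remove possible clinical candidates.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float  # molecular weight in Da
    clogp: float       # calculated logP


def outside_property_space(c: Compound, max_mw: float = 500.0, max_clogp: float = 5.0) -> bool:
    """True if the compound exceeds either threshold."""
    return c.mol_weight > max_mw or c.clogp > max_clogp


def split_by_property_space(compounds):
    """Return (inside, outside) lists of compound names."""
    inside = [c.name for c in compounds if not outside_property_space(c)]
    outside = [c.name for c in compounds if outside_property_space(c)]
    return inside, outside


# Hypothetical compounds and property values, for illustration only.
library = [
    Compound("cpd_1", 350.0, 2.1),
    Compound("cpd_2", 620.0, 5.8),
    Compound("cpd_3", 480.0, 5.4),
]
print(split_by_property_space(library))
```

For a target class such as anticancer agents, the same function could be called with different thresholds, which is one simple way to apply class-specific rules.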

A representative of the inverse docking approach, in which one compound is docked against many proteins, is INVDOCK, which

was recently applied for identifying potential adverse reactions using a

database of 147 proteins related to toxicities (DART). This method has

been recently demonstrated with 11 marketed anti-HIV drugs resulting in

reasonable accuracy against the DNA polymerase beta and DNA

topoisomerase I (Ji et al., 2006). The public availability of data on drugs

and drug-like molecules may make the analyses described above possible

for scientists outside the private sector. For example, chemical repositories

such as DrugBank (http://redpoll.pharmacy.ualberta.ca/drugbank/) (Wishart

et al., 2006), PubChem (http://pubchem.ncbi.nlm.nih.gov/), KiDB

(http://kidb.bioc.cwru.edu/) (Roth et al., 2004; Strachan et al., 2006) and

others contain a wealth of target and small molecule data that can be

mined and used for computational pharmacology approaches.

Nuclear receptors: Nuclear receptors constitute a family of ligand-

activated transcription factors of paramount importance for the

pharmaceutical industry since many of its members are often considered as

double-edged swords (Shi, 2006). On the one hand, because of their

important regulatory role in a variety of biological processes, mutations in

nuclear receptors are associated with many common human diseases such

as cancer, diabetes and osteoporosis and thus, they are also considered

highly relevant therapeutic targets. On the other hand, nuclear receptors

act also as regulators of some of the CYP enzymes responsible for the

metabolism of pharmaceutically relevant molecules, as well as transporters

that can mediate drug efflux, and thus they are also regarded as potential

therapeutic antitargets.

Examples of the use of target-based virtual screening to identify

novel small molecule modulators of nuclear receptors have been recently

reported. Using the available structure of the oestrogen receptor subtype α

(ERα) in its antagonist conformation, a homology model of the retinoic acid

receptor α (RARα) was constructed. Using this homology model, virtual

screening of a compound library led to the identification of two novel

RARα antagonists in the micromolar range. The same approach was later

applied to discover 14 novel and diverse micromolar antagonists of the

thyroid hormone receptor (Schapira et al., 2000). By means of a procedure

designed particularly to select compounds fitting onto the LxxLL

peptide-binding surface of the oestrogen receptor, novel ERα antagonists were

identified (Shao et al., 2004). The discovery of three low micromolar hits for

ERβ displaying over 100-fold binding selectivity with respect to ERα was

also recently reported using database screening (Zhao and Brinton, 2005).

A final example reports the identification and optimization of a novel family

of peroxisome proliferator-activated receptor-γ partial agonists based

upon pyrazol-4-ylbenzenesulfonamide after employing structure-based

virtual screening, with good selectivity profile against the other subtypes of

the same nuclear receptor group (Lu et al., 2006).

Antibacterials

Twenty deoxythymidine monophosphate analogues were used along

with docking to generate a pharmacophore for Mycobacterium tuberculosis

thymidine monophosphate kinase inhibitors with the Catalyst software.

A final model was used to screen a large database spiked with known

inhibitors. In addition, the model was used to rapidly screen half a million

compounds in an effort to discover new inhibitors (Gopalakrishnan et al.,

2005).

Antivirals

Neuraminidase is a major surface protein of the influenza virus.

A structure-based approach was used to generate Catalyst

pharmacophores and these in turn were used for a database search and

aided the discovery of known inhibitors. The hit lists were also very

selective (Steindl and Langer, 2004). Using this kind of screening to design

antivirals could help in managing major epidemics and pandemics.

During a pandemic outbreak there is usually little time for

surveillance, as the discovery process takes time. Screening for compounds

with activity allows rapid identification of candidates and an earlier start to appropriate

control measures.

Human rhinovirus 3C protease is an antirhinitis target. A structure-

based pharmacophore was developed initially around AG 7088 but this

proved too restrictive. A second pharmacophore was developed from seven

peptidic inhibitors using the Catalyst HIPHOP method. This hypothesis was

useful in searching the world drug index database to retrieve compounds

with known antiviral activity and several novel compounds were selected

from other databases with good fits to the pharmacophore, indicating that

they would be worth testing, although experimental validation data

were not presented (Steindl et al., 2005b).

Human rhinovirus coat protein is another target for antirhinitis.

A pharmacophore was generated from the structure and shape of a known

inhibitor and tested for its ability to find known inhibitors in a database.

Ultimately, after screening the Maybridge database, 10 compounds were

suggested that were then docked and scored. Six compounds were tested

and found to inhibit viral growth. However, the majority of them were found

to be cytotoxic or to have poor solubility (Steindl et al., 2005a). The

LigandScout approach was tested on the rhinovirus serotype 16 and was able to

find known inhibitors in the PDB (Wolber and Langer, 2005). The SARS

coronavirus 3C-like proteinase has been addressed as a potential drug

design target. A homology model was built and chemical databases were

docked into it. A pharmacophore model and drug-like rules were used to

narrow the hit list. Forty compounds were tested and three were found with

micromolar activity, the best being calmidazolium at 61 µM (Liu et al.,

2005), perhaps a starting point for further optimization.

A pharmacophore has also been developed to predict the hepatitis

C virus RNA-dependent RNA polymerase inhibition of diketo acid

derivatives. A Catalyst HypoGen model was derived with 40 molecules with

activities over three log orders to result in a five-feature pharmacophore

model. This was in turn tested with 19 compounds from the same data set

as well as nine diketo acid derivatives, for which the predicted and

experimental data were in good agreement (Di Santo et al., 2005).

1.7.3 Protein-protein interactions

Protein-protein interactions are key components of cellular signalling

cascades, the selective interruption of which would represent a sought after

therapeutic mechanism to modulate various diseases (Tesmer, 2006).

However, it has been difficult to derive small molecule inhibitors of such

pharmacological targets by in silico methods, owing to their generally quite

shallow binding sites. The G-protein Gβγ complex can regulate a number of

signalling proteins via protein-protein interactions. The search for small

molecules that interfere with Gβγ protein-protein interactions has been

pursued using FlexX docking and consensus scoring of 1990 molecules

from the NCI diversity set database (Bonacci et al., 2006). After testing 85

compounds as inhibitors of the Gβ1γ2-SIRK peptide interaction, nine compounds were

identified with IC50 values from 100 nM to 60 µM. Further substructure

searching was used to identify similar compounds to one of the most potent

inhibitors to build a SAR. These efforts may eventually lead to more potent

lead compounds.
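
Consensus scoring, as mentioned above, combines the output of several scoring functions so that a compound must do well under more than one of them. The sketch below shows one generic scheme, averaging the rank of each compound across scoring functions. It is an assumption for illustration and is not claimed to be the scheme used in the cited study; the compound names and scores are hypothetical, and lower scores are taken to be better.

```python
def rank_positions(scores: dict, lower_is_better: bool = True) -> dict:
    """Map each compound to its 1-based rank under one scoring function."""
    ordered = sorted(scores, key=scores.get, reverse=not lower_is_better)
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus_ranking(score_tables: list) -> list:
    """Order compounds by their mean rank across all scoring functions.

    Every compound is assumed to appear in every score table.
    """
    ranks = [rank_positions(table) for table in score_tables]
    names = list(ranks[0])
    mean_rank = {n: sum(r[n] for r in ranks) / len(ranks) for n in names}
    return sorted(mean_rank, key=mean_rank.get)


# Hypothetical docking scores from three scoring functions.
score_tables = [
    {"A": -9.1, "B": -7.4, "C": -8.2},
    {"A": -35.0, "B": -30.0, "C": -41.0},
    {"A": -5.0, "B": -4.0, "C": -4.5},
]
print(consensus_ranking(score_tables))
```

Rank averaging avoids having to put differently scaled scores on a common scale, at the cost of ignoring how large the score differences are.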

A structure-based Catalyst pharmacophore was developed for

acetylcholinesterase, which was subsequently used to search a natural

product database. The strategy identified scopoletin and scopolin as hits,

which were later shown to have moderate in vivo activity (Rollinger et al.,

2004). The same database was also screened against cyclooxygenase

(COX)-1 and (COX)-2 structure-based pharmacophores, leading to the

identification of known COX inhibitors. These represent examples where a

combination of ethnopharmacological and computational approaches may

aid drug discovery (Rollinger et al., 2005).

Homology models for the human 12-LOX and 15-LOX have also been

used with the flexible ligand docking programme Glide (Schrödinger Inc.) to

perform virtual screening of 50 000 compounds. Out of 20 compounds

tested, 8 had inhibitory activity and several were in the low micromolar

range (Kenyon et al., 2006).

1.7.4 Kinases

The kinases represent an attractive family of over 500 targets for the

pharmaceutical industry, with several drugs approved recently. Kinase

space has been mapped using selectivity data for small molecules to create

a chemogenomic dendrogram for 43 kinases that showed the highly

homologous kinases to be inhibited similarly by small molecules (Vieth et

al., 2004).

Drug-metabolizing enzymes and transporters: Mathematical

models describing quantitative structure-metabolism relationships were

pioneered by Hansch et al. (1968) using small sets of similar molecules

and a few molecular descriptors. Later, Lewis and co-workers provided

many QSAR and homology models for the individual human CYPs (Lewis,

2000). As more sophisticated computational modelling tools have become

available, there has been steep growth in the number of available models (De

Groot and Ekins, 2002; De Graaf et al., 2005; De Groot, 2006) and the size

of the data sets they encompass. Some more recent methods are also

incorporating water molecules into the binding sites when docking

molecules into these enzymes and these may be important as hydrogen

bond mediators with the binding site amino acids (Lill et al., 2006). Docking

methods can also be useful for suggesting novel metabolites for drugs. A

recent example used a homology model of CYP2D6 and docked

metoclopramide as well as 19 other drugs to show a good correlation

between IC50 and docking score (r² = 0.61) (Yu et al., 2006).
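
A correlation of this kind is usually reported as the squared Pearson correlation coefficient between two series of values. The sketch below shows how such an r² could be computed in plain Python; the docking scores and activity values are hypothetical and are not the data from the cited study.

```python
def r_squared(x: list, y: list) -> float:
    """Square of the Pearson correlation coefficient between x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return s_xy ** 2 / (s_xx * s_yy)


# Hypothetical docking scores and log-transformed IC50 values.
docking_scores = [-9.2, -8.1, -7.5, -6.9, -6.0]
log_ic50 = [-7.8, -7.1, -6.0, -6.4, -5.2]
print(round(r_squared(docking_scores, log_ic50), 2))
```

An r² of 0.61 means that about 61% of the variance in one series is accounted for by a linear relationship with the other, so a substantial part of the variation remains unexplained.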

A novel aromatic N-hydroxy metabolite was suggested as the major

metabolite and confirmed in vitro. Now that several crystal structures of the

mammalian CYPs are available, they have been found to compare quite

favourably to the prior computational models (Rowland et al., 2006).

However, for some enzymes like CYP3A4, where there is both ligand and

protein promiscuity, there may be difficulty in making reliable predictions

with some computational approaches such as docking with the available

crystal structures (Ekroos and Sjogren, 2006). Hence, multiple

pharmacophores or models may be necessary for this and other enzymes

(Ekins et al., 1999), as it has been indicated by others more recently (Mao

et al., 2006).

Sulfotransferases, a second class of conjugating enzymes, have been

crystallized (Dajani et al., 1999; Gamage et al., 2003) and a QSAR method

has also been used to predict substrate affinity to SULT1A3. The

computational modelling of drug transporters has been thoroughly reviewed

by numerous groups (Zhang et al., 2002a, b; Chang and Swaan, 2005).

Various transporter models have also been applied to database searching

to discover substrates and inhibitors (Langer et al., 2004; Pleban et al.,

2005; Chang et al., 2006b) and increase the efficiency of in vitro screening

or enrichment over random screening.
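
Enrichment over random screening is commonly expressed as an enrichment factor: the hit rate among the compounds selected by the model divided by the hit rate in the whole library. The sketch below computes this quantity; the function name and all numbers are hypothetical and only illustrate the arithmetic.

```python
def enrichment_factor(actives_selected: int, n_selected: int,
                      actives_total: int, n_total: int) -> float:
    """Hit rate in the selected subset divided by the hit rate in the full library."""
    if n_selected <= 0 or n_total <= 0 or actives_total <= 0:
        raise ValueError("counts must be positive")
    hit_rate_selected = actives_selected / n_selected
    hit_rate_random = actives_total / n_total
    return hit_rate_selected / hit_rate_random


# Hypothetical screen: 8 actives among 20 selected compounds,
# from a library of 10,000 compounds containing 100 actives.
print(enrichment_factor(8, 20, 100, 10_000))
```

A value above 1 indicates that the model selects actives more often than random picking would; a value near 1 indicates no benefit.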

Receptors: There are more than 20 different families of receptors that

are present in the plasma membrane, altogether representing over 1000

proteins of the receptorome (Strachan et al., 2006). Receptors have been

widely used as drug targets and they have a wide array of potential ligands.

However, it should be noted that to date we have only characterized and

found agonists and antagonists for a small percentage of the receptorome.

1.8 DRUG TARGETS

Wikipedia defines a drug target as follows: "A biological target is a biopolymer

such as a protein or nucleic acid whose activity can be modified by an

external stimulus".

It has been estimated that current drug therapies are directed at fewer

than 500 targets. Despite unprecedented growth in medical sciences and

technology, only approximately 500 drug targets had been reported by 2000.

Considering that the human genome contains some 30,000 genes, it is

possible that its study could lead to at least 3,000 to 5,000 potential new

targets for therapy. Currently, predominant candidates include G protein-

coupled receptor families and other receptors and related molecules, a

wide range of enzymes including proteases, kinases and phosphatases,

hormones, growth factors, chemokines, soluble receptors and related

molecules, and many others. Exactly the same principles are being applied

to the search for agents to interfere with key biochemical pathways in

pathogens, based on information which is being obtained from the

pathogen genome project (WHO Reports, 2002).

1.8.1 Characteristics of an ideal drug target (Pathogenic Organisms)

The genome data must be analyzed by in vitro and in silico means to

pin down drug targets for developing new drugs. An ideal target

should fulfill the following four criteria.

Essentiality: The target should be essential for the growth,

replication and survival of the organism.

Selectivity: The target should not have clear orthologs in the human

host.

Spectrum: The target should be conserved in a number of

pathogens, providing adequate spectrum for any potential inhibitors.

Functionality: The function of the target has to be determined in order to

detect inhibitors of the target.
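
The four criteria above can be applied as a simple filter over a list of candidate genes. The sketch below is only an illustration: the record fields, the gene names and the minimum number of pathogens used for the spectrum criterion are hypothetical, and in practice each field would come from experimental or comparative genomic evidence.

```python
from dataclasses import dataclass


@dataclass
class TargetCandidate:
    gene: str
    essential: bool              # essentiality
    has_human_ortholog: bool     # selectivity (should be False)
    pathogens_conserved_in: int  # spectrum
    function_known: bool         # functionality


def is_ideal_target(t: TargetCandidate, min_pathogens: int = 2) -> bool:
    """True only if all four criteria are met."""
    return (
        t.essential
        and not t.has_human_ortholog
        and t.pathogens_conserved_in >= min_pathogens
        and t.function_known
    )


# Hypothetical candidates, for illustration only.
candidates = [
    TargetCandidate("gene_a", True, False, 5, True),
    TargetCandidate("gene_b", True, True, 5, True),    # fails selectivity
    TargetCandidate("gene_c", False, False, 3, True),  # fails essentiality
    TargetCandidate("gene_d", True, False, 1, True),   # fails spectrum
]
shortlist = [t.gene for t in candidates if is_ideal_target(t)]
print(shortlist)
```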

1.8.2 Identifying Drug Targets

Virulence genes as drug targets

The complete genome data sets also spur early identification of

virulence genes. These genes can be identified either by in vitro expression

technology or by DNA microarrays. Extensive analysis coupled with the

comparison of pathogenic and non-pathogenic microbes will reveal the

pathogenicity islands, which encode the virulence factors. Most often, these

islands differ from the rest of the genome in certain parameters such as GC

content, codon usage and gene density. The proteins encoded by these

pathogenicity islands are promising candidates for alternative targets.
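
One of the parameters mentioned above, GC content, lends itself to a simple sliding-window scan. The sketch below flags windows whose GC fraction deviates from the genome-wide value; the window size, the tolerance and the toy sequence are hypothetical, and real analyses would use windows of kilobase scale together with codon usage and gene density.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def atypical_windows(genome: str, window: int = 10, step: int = 10, tolerance: float = 0.2):
    """Return (start, gc) for windows whose GC fraction differs from the
    genome-wide value by more than the tolerance."""
    baseline = gc_content(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_content(genome[start:start + window])
        if abs(gc - baseline) > tolerance:
            flagged.append((start, gc))
    return flagged


# Toy sequence: an AT-rich background with one GC-rich insert.
genome = "ATATATATAT" * 3 + "GCGCGCGCGC" + "ATATATATAT" * 3
print(atypical_windows(genome))
```

A flagged window is only a hint that a region may have been acquired horizontally; it would need to be confirmed by the comparative analysis described in the text.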

Species specific genes as drug targets

Peer Bork and his coworkers devised an interesting approach for the

prediction of potential drug targets. They designated this approach

“differential genome display”. The approach relies on the fact that

pathogenic organisms code for fewer proteins than free-living organisms;

those proteins which are present in the pathogen and absent in free-living

organisms are considered potential drug targets.
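
At its core, the comparison described above is a set difference. The sketch below assumes that the proteins of each genome have already been assigned to protein families (for example by sequence comparison), so that each genome is represented by a set of family identifiers; the identifiers and organisms are hypothetical.

```python
def differential_display(pathogen_families, free_living_genomes):
    """Protein families present in the pathogen but absent from every
    free-living comparator genome."""
    present_elsewhere = set().union(*free_living_genomes)
    return set(pathogen_families) - present_elsewhere


# Hypothetical family identifiers, for illustration only.
pathogen = {"fam_01", "fam_02", "fam_07", "fam_09"}
free_living = [
    {"fam_01", "fam_02", "fam_03", "fam_04"},
    {"fam_01", "fam_05", "fam_09"},
]
print(sorted(differential_display(pathogen, free_living)))
```

The quality of the result depends entirely on how the family assignments were made, which is outside the scope of this sketch.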

Effective drug targets are selected based on several important

criteria: they must be necessary to bacterial survival or growth, highly

conserved in either a broad or narrow range of pathogens, absent or very

different in humans, and understood biochemically (Rosamond and Allsop,

2000).

Microbial genomics and drug discovery

Modern sequencing techniques enable rapid sequencing, assisted by

computational tools that perform automated annotation of

freshly sequenced genome data. Researchers quickly mine these

data sets for exploring novel targets for both antimicrobial and vaccine

development.

Unique enzyme and drug targets

Since most of the known antibacterials act as inhibitors of bacterial

enzymes, all bacteria-specific enzymes can be considered potential drug

targets. These enzymes can be identified in organisms by genome subtraction

methods, followed by comprehensive analysis of the resulting proteins for

confirmation. Easier and more efficient identification is possible with a

similar approach called “pathway subtraction”. This approach quickly

identifies enzyme pathways that are specific to bacteria, from

which drug targets can be readily identified. A typical example is isoprenoid

biosynthesis in lower and higher organisms. Since the two

groups use completely different enzyme systems for the biosynthesis of

isoprenoids, the enzymes of the bacterial pathway are obvious targets for

drug design. This is illustrated by fosmidomycin, which

inhibits one of the enzymes in this pathway. The ubiquitin

regulatory pathway, in which ubiquitin is conjugated and deconjugated with

substrate proteins, represents a source of many potential targets for

modulation of cancer and other diseases (Wong et al., 2003).
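
Pathway subtraction can be sketched in the same spirit, working on whole pathways rather than single genes. The example below is a simplified illustration: it assumes that pathway names are directly comparable between organisms, and the two small pathway tables are toy data (the enzyme names for the two isoprenoid routes are given for orientation only).

```python
def pathway_subtraction(pathogen_pathways, host_pathways):
    """Return the pathways (with their enzymes) that occur in the pathogen
    but not in the host. Both arguments map pathway name -> list of enzymes."""
    return {
        name: enzymes
        for name, enzymes in pathogen_pathways.items()
        if name not in host_pathways
    }


# Toy data: bacteria and humans make isoprenoids by different routes.
pathogen_pathways = {
    "MEP isoprenoid biosynthesis": ["Dxs", "Dxr", "IspD", "IspE", "IspF", "IspG", "IspH"],
    "glycolysis": ["Pgk", "Eno"],
}
host_pathways = {
    "mevalonate isoprenoid biosynthesis": ["HMGCS", "HMGCR", "MVK", "PMVK", "MVD"],
    "glycolysis": ["PGK1", "ENO1"],
}

for name, enzymes in pathway_subtraction(pathogen_pathways, host_pathways).items():
    print(name, enzymes)
```

Every enzyme of a pathway that survives the subtraction is a candidate, which is why this route can be quicker than comparing genes one at a time.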

Membrane transporters as drug targets

Comparative analysis of bacterial genomes has shown that most

pathogenic microbes do not have well-developed biosynthetic capabilities

when compared to free-living or related non-pathogenic forms.

Hence most of these organisms depend completely on the host for their

essential nutrients. A metabolic pathway analysis will reveal substrates that

cannot be produced by the bacterium and hence need to be

transported. This in turn leads to the identification of bacterial transport proteins,

which could be effective drug targets.

1.9 TARGET PREDICTION METHODS AND STRATEGIES - AN OVERVIEW

1.9.1 Protein interaction network strategy for drug target identification

Proteins are the principal targets of drug discovery. Knowing what

proteins are expressed and how is therefore the first step to generating

value from the knowledge of the human genome. High-throughput

proteomics, identifying potentially hundreds to thousands of protein

expression changes in model systems following perturbation by drug

treatment or disease, lends itself particularly well to target identification in

drug discovery. Protein-protein interaction is the basis of drug target

identification. Protein interaction maps can reveal novel pathways and

functional complexes, allowing ‘guilt by association’ annotation of

uncharacterized proteins. Once the pathways are mapped, these need to

be analyzed and validated functionally in a biological model. It is possible

that other proteins operating in the same pathway as a known drug target

could also represent appropriate drug targets.
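
The ‘guilt by association’ idea can be illustrated with a small sketch that assigns to an uncharacterized protein the most common annotation among its interaction partners. The interaction list, the annotations and the simple majority vote are assumptions made for illustration; published methods weight the evidence in more elaborate ways.

```python
from collections import Counter


def guilt_by_association(interactions, annotations, protein):
    """Predict a function for `protein` as the most frequent annotation
    among its direct interaction partners; return None if no partner
    is annotated.

    interactions: iterable of (protein_a, protein_b) pairs (undirected).
    annotations:  dict protein -> functional annotation.
    """
    partners = set()
    for a, b in interactions:
        if a == protein:
            partners.add(b)
        elif b == protein:
            partners.add(a)
    # Sorting makes the result reproducible when two annotations tie.
    votes = Counter(annotations[p] for p in sorted(partners) if p in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical interaction map: X is the uncharacterized protein.
interactions = [("P1", "X"), ("P2", "X"), ("X", "P3"), ("P3", "P4")]
annotations = {
    "P1": "cell wall synthesis",
    "P2": "cell wall synthesis",
    "P3": "DNA repair",
}
print(guilt_by_association(interactions, annotations, "X"))
```

Such a prediction is only a hypothesis about function and still requires the functional validation described in the text.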

Recent analyses of network properties of protein-protein interactions

and of metabolic maps have provided some insights into the structure of

these networks. So identifying protein-protein interactions can provide

insights into the function of important genes, elucidate relevant pathways,

and facilitate the identification of potential drug targets. Powerful

bioinformatics software enables rapid interpretation of protein-protein

interactions, accelerating functional assignment and drug target discovery.

Whatever the actual number of drug targets may be,

the available data strongly suggest that the present number of known and

well-validated drug targets is still relatively small. Bioinformatics is making

practical contributions in identifying large numbers of potential drug targets;

however, target validation efforts are required to link them to the aetiology

of known diseases and/or to demonstrate that the novel targets have

relevant therapeutic potential. The biochemical pathways put a drug target

into context: one can chart those in which a target is seen, and thus make

educated guesses about the effects that blocking the target is likely to

have. Further, more complete knowledge of biological pathways should be

used to gain clues for potential target proteins. Despite the promising

results obtained in the different tests carried out by this strategy, there are

several potential problems in applications to drug target identification and

validation. First, it is yet unclear if the currently available genomic

databases, coupled with newly developed computational algorithms, can

offer sufficient information for automated in silico drug target identification.

For improving the biological accuracy of estimated gene networks, other

biological information such as sequence information on promoter regions

and protein-protein interactions should be integrated. Secondly, as real

biological processes are often condition specific, and gene expression data

tend to be noisy and often plagued by outliers, it is important to take

“conditions” or “environments” into account. The problem of capturing long-run

network behavior for large-size networks is difficult owing to the

exponential increase of the state spaces. Thirdly, an increasing population

of bioinformatics tools and the lack of an integrated and systematized

interface for their selection and utilization is becoming widely

acknowledged. Last and perhaps most important, understanding of how a

target protein works in the context of cellular pathways is rudimentary, and

linking diseases in humans to biochemical pathways studied in cells is also

difficult; gene network identification is a hard problem, and modeling

larger protein complexes will be an important challenge. The identification and

validation of drug targets depends critically on knowledge of the

biochemical pathways in which potential target molecules operate within

cells. This requires a restructuring of the classical linear progression from

gene identification, functional elucidation, target validation and screen

development. One of the major goals of pharmaceutical bioinformatics is to

develop computational tools for systematic in silico molecular target

identification.

One of the most important challenges for drug development, however,

is to rapidly identify target proteins most appropriate to further

development. Bioinformatics technology in the past decade has given birth

to the new paradigm of a biology-driven process. There are many exciting

developments to come in the field of target identification. Gene network

technology creates cell and organ-level computer models able to simulate

the clinical performance of drugs and drug candidates. By predicting how

and why specific compounds impact human biology, gene network

techniques may provide a glimpse of the signals and interactions within

regulatory pathways of the cell. In fact, it is now possible to think of the

whole pharmaceutical process as a computational approach, with

confirmatory experiments at each decision-point.

1.10 METHODS FOR DRUG TARGET IDENTIFICATION

The identification of disease-relevant phenotypes is followed by the

identification of novel drug targets that modulate or inhibit these responses.

The methods for doing so can be broadly classified into three approaches:

Mechanism-driven approach

Physiological approach

Gene-driven approach

1.10.1 Mechanism-driven approach - Determining novel drug targets from network

structures

With the development of bioinformatics, a number of computational

techniques have been used to search for novel drug targets from the

information contained in genomics. The network-based strategy for drug

target identification attempts to reconstruct endogenous metabolic,

regulatory and signaling networks with which potential drug targets interact.

Once this information has been provided by gene networks or protein

networks, the interaction relationships between potential drug targets can

be revealed explicitly. It then becomes easier to determine which of these

potential drug targets is most suitable, or at least to narrow down

the scope of candidate drug targets considerably. For example, if a

potential drug target participates in many biological pathways of the

pathogen, the inhibition of this target may interfere with many activities

associated with those pathways, and the target may therefore be a good

candidate drug target.
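
The pathway-participation heuristic described above can be sketched as a simple ranking. The pathway table and gene names below are hypothetical; the sketch only counts memberships and does not model the strength or direction of any interaction.

```python
def rank_by_pathway_count(pathway_members, candidates):
    """Rank candidate targets by the number of pathways they take part in.

    pathway_members: dict pathway name -> list of gene/protein names.
    candidates:      iterable of candidate target names.
    Returns (name, count) pairs, highest count first, ties in name order.
    """
    counts = {c: 0 for c in candidates}
    for members in pathway_members.values():
        for gene in set(members):  # count each pathway at most once
            if gene in counts:
                counts[gene] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical pathway membership for a pathogen.
pathway_members = {
    "pathway_1": ["geneA", "geneB"],
    "pathway_2": ["geneA", "geneC"],
    "pathway_3": ["geneA", "geneB"],
}
print(rank_by_pathway_count(pathway_members, ["geneA", "geneB", "geneC", "geneD"]))
```

A high count suggests broad impact on the pathogen, but, as noted later in this section, highly connected targets can also sit in pathways that are more complex than expected.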

It involves acquiring a molecular level understanding of the function

of drug targets. On the molecular level, function is manifested in the

behavior of complex networks. It is necessary to know the cellular context

of the drug target and the impact of its inhibition or activation on multiple

signaling pathways. Graphical models are often used to describe genetic

networks. Generally, a gene network could be presented in a directed

graph, in which nodes indicate genes and edges represent regulations

between genes (e.g. activation or suppression). Analyzing the network

structures of large-scale interrogation of cellular processes holds promise

for the identification of essential mediators of signal transduction pathways

and potential drug targets. In order to find proper candidate target genes,

one needs biological knowledge of the pathways underlying the disease

process. So the study of biochemical pathways is the focus of numerous

researchers. However, owing to the complexity of pathway structures, many

potential drug targets turned out worthless because the pathways in which

they participate were more complex than expected. A promising strategy is

to examine the functionality of different genes in the network and observe

the connectivity of different functional domains. Some researchers have

implemented this gene network-based strategy for drug target identification.

First, using the gene expression data obtained from expression

experiments of several dose and time responses to the drug, those genes

affected by the drug (drug-affected genes) could be identified by fold-

change analysis or virtual gene technique. Because there is no guarantee

that genes most affected by the drug are the genes that were "drugged" by

the drug agent, nor is there any guarantee that the drugged target

represents the most biologically available and advantageous molecular

target for intervention with new drugs, they further searched for the most

suitable drug target genes upstream of the drug-affected genes in a

regulatory network. Using gene expression profiles obtained from 120 gene

disruptions, they employed a method based on a Bayesian network model to

construct a gene network. Then, by exploring the gene network, they found

the “druggable genes”, namely drug targets regulating the drug-affected

genes most strongly, and a novel drug target gene was identified and

validated.
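
The two steps of this strategy, finding drug-affected genes and then searching upstream of them, can be sketched as follows. This is a deliberately simplified stand-in: the regulatory network is given as a fixed list of edges rather than inferred with a Bayesian network model, regulators are scored by a plain count of affected genes downstream, and the threshold, expression values and gene names are hypothetical.

```python
import math


def drug_affected_genes(control, treated, min_abs_log2_fc=1.0):
    """Genes whose expression changes by at least the given log2 fold change.
    Expression values are assumed to be strictly positive."""
    affected = set()
    for gene, base in control.items():
        log2_fc = math.log2(treated[gene] / base)
        if abs(log2_fc) >= min_abs_log2_fc:
            affected.add(gene)
    return affected


def downstream(edges, gene):
    """All genes reachable from `gene` by following regulator -> target edges."""
    children = {}
    for regulator, target in edges:
        children.setdefault(regulator, set()).add(target)
    seen, stack = set(), [gene]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def rank_upstream(edges, affected):
    """Rank regulators by how many drug-affected genes lie downstream of them."""
    regulators = {regulator for regulator, _ in edges}
    scores = {r: len(downstream(edges, r) & affected) for r in regulators}
    ranked = [(r, s) for r, s in scores.items() if s > 0]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical expression levels before and after drug treatment.
control = {"g1": 10.0, "g2": 10.0, "g3": 10.0, "r1": 10.0, "r2": 10.0}
treated = {"g1": 40.0, "g2": 2.5, "g3": 11.0, "r1": 10.0, "r2": 10.0}
edges = [("r1", "g1"), ("r1", "g2"), ("r2", "g3"), ("r2", "g2")]

affected = drug_affected_genes(control, treated)
print(sorted(affected))
print(rank_upstream(edges, affected))
```

In this toy case the regulator that controls both affected genes is ranked first, which mirrors the idea of looking for the gene that regulates the drug-affected genes most strongly.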

1.10.2 Gene-driven approach - Gene network strategy for drug target identification

The molecular interactions of genes and gene products underlie

fundamental questions of biology. Genetic interactions are central to the

understanding of molecular structure and function, cellular metabolism, and

response of organisms to their environments. If such interaction patterns

can be measured for various kinds of tissues and the corresponding data

can be interpreted, potential benefits are obvious for the identification of

candidate drug targets. It has already been demonstrated that it is possible

to infer a predictive model of a genetic network from time-series gene

expression data or from steady-state gene expression data of gene knockouts.

Using the inferred model, useful predictions can be made by mathematical

analysis and computer simulations. Recently several computational

methods have been proposed to reconstruct gene networks, such as

Boolean networks, differential equation models and Bayesian networks.

These quantitative approaches can be applied to natural gene networks

and used to generate a more comprehensive understanding of cellular

regulation, discover the underlying gene regulatory mechanisms and reveal

the interactions between drugs and the drug targets in cells.
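
Of the model classes listed above, the Boolean network is the simplest to illustrate. The sketch below simulates a three-gene toy network with synchronous updates and compares the wild type with a knockout; the genes and update rules are invented for illustration. A network of n genes has 2^n possible states, which is why long-run behavior becomes hard to capture for large networks.

```python
def step(state, rules, knocked_out=frozenset()):
    """One synchronous update: every gene is recomputed from the old state.
    Knocked-out genes are forced to False (off)."""
    return {
        gene: False if gene in knocked_out else rules[gene](state)
        for gene in state
    }


def simulate(initial, rules, steps, knocked_out=frozenset()):
    """Return the list of states visited, starting from `initial`."""
    state = {g: (False if g in knocked_out else v) for g, v in initial.items()}
    trajectory = [state]
    for _ in range(steps):
        state = step(state, rules, knocked_out)
        trajectory.append(state)
    return trajectory


# Toy cascade: A is a constant input, A activates B, B activates C.
rules = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}
initial = {"A": True, "B": False, "C": False}

wild_type = simulate(initial, rules, steps=2)
knockout_b = simulate(initial, rules, steps=2, knocked_out=frozenset({"B"}))
print(wild_type[-1])
print(knockout_b[-1])
```

Comparing the final states shows how a simulated knockout propagates through the network, which is the kind of prediction such models are used for.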

1.10.3 Physiological approach - Protein interaction network strategy

for drug target identification

Proteins are the principal targets of drug discovery. Knowing what

proteins are expressed and how is therefore the first step to generating

value from the knowledge of the human genome. Proteomics has unique

and significant advantages as an important complement to a genomics

approach. High-throughput proteomics, identifying potentially hundreds to

thousands of protein expression changes in model systems following

perturbation by drug treatment or disease, lends itself particularly well to

target identification in drug discovery. Protein-protein interaction is the

basis of drug target identification. Protein interaction maps can reveal novel

pathways and functional complexes, allowing ‘guilt by association’

annotation of uncharacterized proteins. Once the pathways are mapped,

these need to be analyzed and validated functionally in a biological model.

It is possible that other proteins operating in the same pathway as a known

drug target could also represent appropriate drug targets. Recent analyses

of network properties of protein-protein interactions and of metabolic maps

have provided some insights into the structure of these networks. So

identifying protein-protein interactions can provide insights into the function

of important genes, elucidate relevant pathways, and facilitate the

identification of potential drug targets. Powerful bioinformatics software

enables rapid interpretation of protein-protein interactions, accelerating

functional assignment and drug target discovery.

Whatever the actual number of drug targets may be,

the available data strongly suggest that the present number of known and

well-validated drug targets is still relatively small. Bioinformatics is making

practical contributions in identifying large numbers of potential drug targets;

however, target validation efforts are required to link them to the aetiology

of known diseases and/or to demonstrate that the novel targets have

relevant therapeutic potential. The biochemical pathways put a drug target

into context: one can chart those in which a target is seen, and thus make

educated guesses about the effects that blocking the target is likely to

have. Further, more complete knowledge of biological pathways should be

used to gain clues for potential target proteins. Despite the promising

results obtained in the different tests carried out by this strategy, there are

several potential problems in applications to drug target identification and

validation. First, it is yet unclear if the currently available genomic

databases, coupled with newly developed computational algorithms, can

offer sufficient information for automated in silico drug target identification.

For improving the biological accuracy of estimated gene networks, other

biological information such as sequence information on promoter regions

and protein-protein interactions should be integrated. Secondly, as real

biological processes are often condition specific, and gene expression data

tend to be noisy and often plagued by outliers, it is important to take

“conditions” or “environments” into account. The problem of capturing long-run

network behavior for large-size networks is difficult owing to the

exponential increase of the state spaces. Thirdly, an increasing population

of bioinformatics tools and the lack of an integrated and systematized

interface for their selection and utilization is becoming widely

acknowledged. Last and perhaps most important, understanding of how a

target protein works in the context of cellular pathways is rudimentary, and

linking diseases in humans to biochemical pathways studied in cells is also

difficult; gene network identification is a hard problem, and modeling

larger protein complexes will be an important challenge. The identification and

validation of drug targets depends critically on knowledge of the

biochemical pathways in which potential target molecules operate within

cells. This requires a restructuring of the classical linear progression from

gene identification, functional elucidation, target validation and screen

development. One of the major goals of pharmaceutical bioinformatics is to

develop computational tools for systematic in silico molecular target

identification.

The advent of genomics offers a means to expand the range of targets, but

the choice of potential drug targets thrown up by genomics data is

overwhelming. One of the most important challenges for drug development,

however, is to rapidly identify target proteins most appropriate to further

development. Genomics and proteomics technologies have created a

paradigm shift in the drug discovery process. Bioinformatics technology in

the past decade has given birth to the new paradigm of a biology-driven

process. There are many exciting developments to come in the field of

target identification. Gene network technology creates cell and organ-level

computer models able to simulate the clinical performance of drugs and

drug candidates. By predicting how and why specific compounds impact

human biology, gene network techniques may provide a glimpse of the

signals and interactions within regulatory pathways of the cell. In fact, it is

now possible to think of the whole pharmaceutical process as a

computational approach, with confirmatory experiments at each decision-

point.

There are several directions for future research. First, in the near

future, data produced about cellular processes at molecular level will

accumulate with an accelerating rate as a result of genomics studies. In

this regard, it is essential to develop approaches for inferring gene

networks from microarray data and other biological data effectively. The

development of systematic approaches to finding genes for effective

therapeutic intervention requires new models and powerful tools for

understanding complex genetic networks. Secondly, integrating

the information from different types of networks may lead to

the notion of functional networks and functional modules. To find these

modules, we should consider the general question of the potential effect of

individual genes on the global dynamical network behavior, from the

viewpoint of both random gene perturbation and intervention. It should be

emphasized that although computational tools and resources can be used

to identify putative drug targets, validating targets is still a process that

requires understanding the role of the gene or protein in the disease

process and is heavily dependent on laboratory based work. The new

integrative technological developments in Systems biology, coupled with a

number of ‘omic’ techniques, may lead to a breakthrough for the

identification and validation of important drug targets in the future.

The application of information technology in the biological and chemical

sciences has become a critical part of molecular modelling, drug

design and database design. Proteins and nucleic acids that play key

roles in disease processes have been explored as therapeutic targets for

drug development (Drews, 2000). Knowledge of these therapeutically

relevant proteins and nucleic acids has facilitated modern drug discovery

by providing platforms for drug screening against a preselected target. It

has also contributed to the study of the molecular mechanism of drug

actions, discovery of new therapeutic targets and development of drug

design tools. Information about non-target proteins and natural small

molecules involved in these pathways is also useful for facilitating the

search of new therapeutic targets and for understanding how therapeutic

targets interact with other molecules to perform specific tasks. A number of

web-based resources of therapeutically targeted proteins and nucleic acids

are available, which provide useful information about the targets of drugs

and investigational agents.

Antibiotics are among the most frequently prescribed medications in

modern medicine. Antibiotics cure disease by killing or injuring bacteria.

The first antibiotic was penicillin, discovered accidentally from a mold

culture. Today, over 100 different antibiotics are available to doctors to

cure minor discomforts as well as life-threatening infections.

Antibiotics are substances that are produced by molds or bacteria

and that kill or inhibit the growth of other microorganisms. In 1928,

Alexander Fleming, a British scientist who was working with

Staphylococcus, a bacterium that most of us have encountered as it causes

wound infections, discovered the first antibiotic. One day, when he, by

mistake, contaminated his bacterial plate with a mold, he noticed that the

Staphylococcus colonies growing near the contaminating mold looked

strange, as if they were dissolving. He realized that this mold secreted a

substance that killed the bacteria. Since the discovery of this antibiotic

many other antibiotics have been discovered and have made it possible to

cure diseases caused by bacteria such as pneumonia, tuberculosis, and

meningitis, saving the lives of millions of people around the world.

Antibiotics specifically attack bacteria without harming cells belonging

to the organism that produced them. Antibiotics such as penicillin kill

bacteria by inhibiting them from making cell walls that are needed for their

survival. Without their cell wall the contents of the cells leak out and the

cell is destroyed. Human and animal cells do not have a cell wall,

so these antibiotics do not damage them.

The current increase in the number of microbes resistant to

antibacterial or antifungal agents represents a potential crisis in human and

veterinary medicine. Some believe that we are entering a post-antibiotic era

in which most antibiotics will no longer be efficacious. Therefore, it is

important that new antibiotics be developed. Since bacteria can exchange

DNA with other bacteria (even with distant genera), bacteria can acquire

resistance genes from resistant organisms. Because of the resulting

potential for cross-resistance, new targets for the discovery of antibiotics

are needed, particularly ones for which resistance does not currently exist. Two

major classes of targets can be considered: essential genes and virulence-

based genes.

Bioinformatics has become indispensable to all fields of life sciences.

The rapid progress of genome projects has brought a vast accumulation of

molecular biological information in the past decade. Millions of nucleic acid

sequences with billions of bases have been deposited in EMBL, GenBank

and DDBJ. Hundreds of specialist databases have been derived from the

above primary sequence databases. The year 2000 saw the

completion of the genome projects of the fruit fly and of Arabidopsis

thaliana, as well as the completion of the draft of the Human

Genome Project. Biology is entering the post-genome era

in the new century. A number of approaches for new vaccine development

exist, including sub-unit protein and DNA vaccines; recombinant vaccines;

auxotrophic organisms to deliver genes and so on. Testing such candidates

is tiresome and expensive. Bioinformatics enables us to reduce

substantially the number of such candidates to test. Scanning of bacterial

genomes to identify essential genes is of biological interest, for

understanding the basic functions required for life, and of practical interest,

for the identification of novel targets for new antimicrobial therapies. The

recent availability of the human genome sequence represents a major step

in drug discovery. Knowledge of the human proteome will provide

unprecedented opportunities for studies of human gene function. Often

clues will be provided by sequence similarity with proteins of known

function in model organisms. Such initial observations must then be

followed up by detailed studies to establish the actual function of these

molecules in humans. The spread of antibiotic resistance in bacteria has

intensified the need for novel approaches to antimicrobial drug discovery.

In recent years, we have seen an explosion in the amount of

biological information that is available. Various databases are doubling in

size every 15 months and we now have the complete genome sequences of

more than 100 organisms. It appears that the ability to generate vast

quantities of data has surpassed the ability to use this data meaningfully.

The pharmaceutical industry has embraced genomics as a source of drug

targets. It also recognises that the field of bioinformatics is crucial for

validating these potential drug targets and for determining which ones are

the most suitable for entering the drug development pipeline.

Researchers have a continued need for enhanced and expanded

genomic and proteomic databases and tools to allow for more rapid,

accurate, and predictive target selection and validation. Genomics and

proteomics are now being leveraged into the next phase of the drug

discovery process, which is finding the best drug molecules. Comparative

and functional genomic data can provide fundamental scientific knowledge

with applications in medicine, industry, agriculture and environmental

biomonitoring. These approaches depend on bioinformatics tools and methods.

The growing use of technologies, such as DNA microarrays and BACs, in

the field of bacterial genomics, has immense potential with respect to

beneficial applications.

Recently, there has been a change in the way that medicines are

being developed due to our increased understanding of molecular biology.

In the past, new synthetic organic molecules were tested in animals or in

whole organ preparations. This has been replaced with a molecular target

approach in which in-vitro screening of compounds against purified,

recombinant proteins or genetically modified cell lines is carried out with a

high throughput. This change has come about as a consequence of better

and ever improving knowledge of the molecular basis of disease.

The availability of whole genomes of many pathogenic bacteria allows

one to speed up the process of drug target selection by finding novel genes

in new and old functional categories previously mentioned. The analysis of

open reading frames of bacterial sequences makes all genes and gene

products possible drug targets (Smith, 1996). Scientists must therefore

isolate the genes that are essential to cell survival or growth, which would

be most effective as antibiotic targets. Traditionally, new genes that were

necessary to bacterial survival or virulence were discovered through

random mutagenesis and phenotyping of the bacterial genome (Hood,

1999). However, scientists can now use automated comparisons of

bacterial genomes to categorize genes and the proteins encoded. Primary

sequence comparison programs, like BLAST or PSI-BLAST, can determine

gene functions by sequence homology. Sequence homology is also used

to determine clusters of orthologous groups (COGs). COGs are groups of

genes shared by evolutionarily distant organisms. These orthologous

families of genes are prime candidates for broad-spectrum antimicrobial

agents.

1.11 OUR APPROACH

In the current research, we have designed an approach to identify

drug targets from bacterial genomes.

Figure 3 represents the steps involved in the prediction and validation

of drug targets in a microbial genome. The target is predicted by comparing

the bacterial genome with essential genes and then comparing these

predicted essential genes with the human genes/proteins to identify

non-homologous drug targets. Previously, a subtractive genomics approach was

used (Sakharkar et al., 2004; Anirban Dutta et al., 2006) to identify

potential drug targets in Pseudomonas aeruginosa and Helicobacter pylori.

In the present approach the complete sequence of identification steps is

automated so that the user can submit the input and get the output as

target sequences.
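
The selection logic of such a pipeline can be sketched as a two-stage filter. The example below is not the implementation used in this work: it assumes that the best-hit E-values of each bacterial protein against an essential gene set and against the human proteome have already been computed by a similarity search, and the function name, cut-offs and protein identifiers are hypothetical.

```python
def predict_targets(proteins, essential_evalue, human_evalue,
                    essential_cutoff=1e-10, human_cutoff=1e-4):
    """Keep proteins that resemble a known essential gene and have no
    significant human homologue.

    essential_evalue / human_evalue: dict protein id -> best-hit E-value;
    a missing key means that no hit was found. Cut-offs are illustrative.
    """
    targets = []
    for protein in proteins:
        e_essential = essential_evalue.get(protein)
        e_human = human_evalue.get(protein)
        is_essential = e_essential is not None and e_essential <= essential_cutoff
        has_human_homologue = e_human is not None and e_human <= human_cutoff
        if is_essential and not has_human_homologue:
            targets.append(protein)
    return targets


# Hypothetical search results for four bacterial proteins.
proteins = ["p1", "p2", "p3", "p4"]
essential_evalue = {"p1": 1e-50, "p2": 1e-30, "p3": 1e-2}
human_evalue = {"p1": 0.5, "p2": 1e-20}

print(predict_targets(proteins, essential_evalue, human_evalue))
```

Keeping the two cut-offs as parameters makes it easy to see how sensitive the final target list is to the stringency of each comparison.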

Fig. 3: Approach - Target prediction and validation

The obtained target sequences were analyzed for their functional roles

using sequence analysis tools (BLAST and Pfam). These

drug targets were validated by comparing them against the approved and

proposed genes/proteins from the DrugBank database.

For the predicted targets from the selected pathogenic organism, the gene

name, protein product, Enzyme Commission number, function and functional

information were collected and stored in a web-based database to act

as a base for the drug discovery process.

______