John Weinstein discusses information-intensive approaches to cancer drug discovery

4
DDT Vol. 6, No. 22 November 2001 update feature 1145 What made you decide to work in cancer research? I think most of us have friends and family who have had cancer. I originally came to the NIH to study another abstract subject – the lipophysical chemistry of membranes – but increasingly became interested in significant medical problems, including cancer. Can you tell me about the work you are doing at the National Cancer Institute? I am working closely with the Developmental Therapeutics Program at the National Cancer Institute (NCI) to integrate several buzzword disciplines – genomics, proteomics, bioinformatics and chemoinformatics – in the context of drug discovery for cancer. Since 1990, the Developmental Therapeutics Program has been screening drugs for anticancer activity in 60 different cancer cell lines in culture. The patterns of activity of the compounds against these 60 cell lines are rich in information about mechanisms of drug action and resistance. Informatic analysis of that system was initiated by Kenneth Paull, who tragically died three years ago, and we are trying our best to follow in his footsteps. The patterns of drug activity are useful particularly because we can map them into molecular structures of the compounds tested and also into molecular characteristics of the cells used for testing. Our principal enterprise is to characterize those cells comprehensively at the DNA, RNA and protein levels, that is, ‘omically’. Have you identified any striking correlations between drug-type and the gene-expression profiles in the cell lines studied? Yes, we have defined quite a few. One that is particularly interesting relates to the gene for asparagine synthetase and the drug asparagine, a bacterial enzyme that has been used since the early 1970s to treat lymphocytic leukaemia. We found there is a strong negative correlation between expression of asparagine synthetase in leukaemia cells and the activity of asparagine against those cells. This observation prompted us to ask whether the same would be true for any other cancer cell type. We found that it was true for ovarian cancer cell lines. This correlation suggested the possibility (but it is only a possibility) that a subset of ovarian cancer patients whose tumors express only low levels of asparagine synthetase might be candidates for therapy with asparaginase. This possibility is being followed up in a collaboration with Daniel von Hoff, Director of the Arizona Cancer Center (Tucson, AZ, USA). Have you found any surprising correlations? The correlation involving ovarian cancer that I just mentioned was a surprise. There have been other unexpected findings at the basic scientific level, but we are still faced with the fact that we are dealing with hundreds of thousands of individual relationships between drugs and genes. The challenge is to figure out which are causally interesting, which are epiphenomenal and which are statistical coincidence. To meet this challenge we make use of the literature – which is based largely on the efforts of those who study one gene, protein or process at a time. To facilitate searching of the literature on gene–gene and gene–drug relationships, we have developed a program called MedMiner, which is freely available on our website, (http://www.discover.nci.nih.com). Those who use it say it speeds up the processes of searching and organizing literature by 5–10-fold. What are the limitations of your research approach, and what experiments will you do to overcome these? The first obvious limitation is that these are cells in culture, and it is well known that cells in culture are not truly representative of cancer cells in vivo. Even primary cultured cancer cells have been extracted from their society of cytokines and other cells in the body. So, when we project towards clinical materials we do so with humility and with the understanding that we are developing clues to formulate hypotheses to carry to validation. It is worth remembering, however, that most of our knowledge of molecular biology and molecular pharmacology has come, historically, from studying cultured cells rather than clinical samples. What experiments could you undertake to validate your findings? There are several types of experiment and several meanings of validation. If we are trying to validate basic results from gene expression arrays, for instance, we can resequence clones used in making the arrays and also use a very different technique, such as real-time RT-PCR, to verify the expression levels obtained from the arrays. We are also very directly comparing cDNA arrays and oligonucleotide arrays for the same set of 60 cell lines. Testing of hypotheses about clinical relationships must be done with clinical materials. This we are doing, in collaboration with Mark Raffeld of the NCI Laboratory of Pathology, using tissue arrays from the Tissue Array Resource Program (TARP) at the NCI. 1359-6446/01/$ – see front matter ©2001 Elsevier Science Ltd. All rights reserved. PII: S1359-6446(01)02046-3 John Weinstein discusses information-intensive approaches to cancer drug discovery Interviewed by Joanna Owens John N. Weinstein, Senior Research Investigator, National Cancer Institute www.drugdiscoverytoday.com Those with major professional expertise in informatics have to be thought of as rock stars. The challenge is to figure out which [relationships] are causally interesting, which are epiphenomenal and which are statistical coincidence.

Transcript of John Weinstein discusses information-intensive approaches to cancer drug discovery

Page 1: John Weinstein discusses information-intensive approaches to cancer drug discovery

DDT Vol. 6, No. 22 November 2001 updatefeature

1145

What made you decide to work in cancerresearch?I think most of us have friends and familywho have had cancer. I originally came tothe NIH to study another abstract subject – the lipophysical chemistry ofmembranes – but increasingly becameinterested in significant medical problems,including cancer.

Can you tell me about the work you aredoing at the National Cancer Institute?I am working closely with theDevelopmental Therapeutics Program atthe National Cancer Institute (NCI) tointegrate several buzzword disciplines –genomics, proteomics, bioinformatics andchemoinformatics – in the context of drugdiscovery for cancer. Since 1990, theDevelopmental Therapeutics Program hasbeen screening drugs for anticanceractivity in 60 different cancer cell lines inculture. The patterns of activity of thecompounds against these 60 cell lines arerich in information about mechanisms ofdrug action and resistance. Informaticanalysis of that system was initiated byKenneth Paull, who tragically died threeyears ago, and we are trying our best tofollow in his footsteps. The patterns ofdrug activity are useful particularly becausewe can map them into molecularstructures of the compounds tested andalso into molecular characteristics of thecells used for testing. Our principalenterprise is to characterize those cellscomprehensively at the DNA, RNA andprotein levels, that is, ‘omically’.

Have you identified any striking correlationsbetween drug-type and the gene-expressionprofiles in the cell lines studied?Yes, we have defined quite a few. One thatis particularly interesting relates to the

gene for asparagine synthetase and thedrug asparagine, a bacterial enzyme thathas been used since the early 1970s totreat lymphocytic leukaemia. We foundthere is a strong negative correlationbetween expression of asparaginesynthetase in leukaemia cells and theactivity of asparagine against those cells.This observation prompted us to askwhether the same would be true for anyother cancer cell type. We found that itwas true for ovarian cancer cell lines. Thiscorrelation suggested the possibility (but itis only a possibility) that a subset of ovariancancer patients whose tumors express onlylow levels of asparagine synthetase mightbe candidates for therapy withasparaginase. This possibility is beingfollowed up in a collaboration with Danielvon Hoff, Director of the Arizona CancerCenter (Tucson, AZ, USA).

Have you found any surprisingcorrelations?The correlation involving ovarian cancerthat I just mentioned was a surprise. Therehave been other unexpected findings atthe basic scientific level, but we are stillfaced with the fact that we are dealingwith hundreds of thousands of individualrelationships between drugs and genes.The challenge is to figure out which arecausally interesting, which areepiphenomenal and which are statistical

coincidence. To meet this challenge wemake use of the literature – which is basedlargely on the efforts of those who studyone gene, protein or process at a time. Tofacilitate searching of the literature ongene–gene and gene–drug relationships,we have developed a program calledMedMiner, which is freely available on ourwebsite, (http://www.discover.nci.nih.com).Those who use it say it speeds up theprocesses of searching and organizingliterature by 5–10-fold.

What are the limitations of your researchapproach, and what experiments will youdo to overcome these?The first obvious limitation is that these arecells in culture, and it is well known thatcells in culture are not truly representativeof cancer cells in vivo. Even primarycultured cancer cells have been extractedfrom their society of cytokines and othercells in the body. So, when we projecttowards clinical materials we do so withhumility and with the understanding thatwe are developing clues to formulatehypotheses to carry to validation. It isworth remembering, however, that mostof our knowledge of molecular biology and molecular pharmacology has come,historically, from studying cultured cellsrather than clinical samples.

What experiments could you undertake tovalidate your findings?There are several types of experiment andseveral meanings of validation. If we aretrying to validate basic results from geneexpression arrays, for instance, we canresequence clones used in making the arraysand also use a very different technique, suchas real-time RT-PCR, to verify theexpression levels obtained from the arrays.We are also very directly comparing cDNAarrays and oligonucleotide arrays for thesame set of 60 cell lines. Testing ofhypotheses about clinical relationshipsmust be done with clinical materials. This we are doing, in collaboration with Mark Raffeld of the NCI Laboratory of Pathology, using tissue arrays from theTissue Array Resource Program (TARP) atthe NCI.

1359-6446/01/$ – see front matter ©2001 Elsevier Science Ltd. All rights reserved. PII: S1359-6446(01)02046-3

John Weinstein discussesinformation-intensiveapproaches to cancer drugdiscovery

Interviewed by Joanna Owens

John N. Weinstein, Senior Research Investigator,National Cancer Institute

www.drugdiscoverytoday.com

Those with major professionalexpertise in informatics have tobe thought of as rock stars.

The challenge is to figure outwhich [relationships] arecausally interesting, which areepiphenomenal and which arestatistical coincidence.

Page 2: John Weinstein discusses information-intensive approaches to cancer drug discovery

update feature DDT Vol. 6, No. 22 November 2001

1146

How will you go from identifying arelationship between drug sensitivity andexpression of a gene, to using this in cancerdrug discovery?Generically, the aim of pharmacogenomicsis to define subpopulations on the basis oftheir gene expression, or their gene orprotein pattern, to lead to candidates for aparticular therapy. Once an association isnoted it will suggest either looking overexisting clinical data to see if thecorrelation holds up there, or else,planning prospectively to treat patients.

So is it possible that your work could leadto the screening of patients for certaingenes or markers that could providepersonalized treatment options?Yes, it is a possibility. For example, we areworking with Mark Raffeld to identifydiagnostic markers of metastases ofunknown origin. Such markers could haveimplications for therapy.

Could your approach to gene-expressionprofiling lead to better and earlier screensfor cancer?I would say an emphatic ‘I don’t know’. Itis the kind of thing that can arise fromresearch into patterns of expression, butwe are not explicitly looking at thedifferences between normal tissues andcancer cells. Many other laboratories aredoing that, and we had to decide early onwhether we join in with that definitelyimportant endeavour or to focus ondifferent, more adventurous projects.

How will the completion of the first draft ofthe human genome sequence impact on yourwork, and on cancer research in general?It will have a major technical benefit for usbecause our ability to identify particularcDNA clones of genes is still incomplete.Although we rely on UniGene databaseclusterings, we can increasingly cross-checkour findings against the developinggenome-sequence information. Once wehave the full sequence of the genome thescientific community will have a mucheasier time trying to establish functionalrelationships, find whole gene sequencesand assign genes to particularchromosomal locations.

What do you think are the biggest hurdlesfor cancer drug discovery in the post-genomic era?Let me cite just one. If you ask scientists atpharma and biotech companies it is clear

that large numbers of targets for cancertherapy have been identified, but perhapsare not the right targets. Everyone ishaving difficulty figuring out which targetswill be most useful.

Which new technologies do you think havebeen key to the development of cancertherapies? Which technologies still need tobe developed?Clearly gene sequencing technologies have been seminal, as have those for geneexpression profiling. And there’s PCR, ofcourse.

As for technologies still to be developed,the first that I would like to see – whichmay or may not be feasible – is a way ofamplifying proteins in the same way thatwe amplify nucleic acids; that is, PCR forproteins. The closest methods we have atthe moment are those based on in vitrotranslation. Second, and an extension ofthat idea, is that we have perhaps 500,000interesting protein states making up the‘wiring diagram’ of the cell, and yet wehave no technology capable of analyzingtheir expression levels comprehensively.This is what we need. There are continuinginnovations in such technologies as massspectrometry and microfluidics, but stillthey can’t begin to approach acomprehensive characterization of thewhole cell.

Do you think that there needs to be moreintegration of chemoinformatics andbioinformatics for optimal use of genomicand chemical data? Will they eventuallybecome one science?My laboratory has been trying for years tomerge chemoinformatics and bioinformatics.Currently, we’re working with Paul Blowerfrom Leadscope (Columbus, OH, USA) onalgorithms and software directed towards

that goal. Although the statistical andanalytical issues in the two areas are quitesimilar, the basic subject matters andexpertise of biologists and chemists aredifferent.

Do you think there is a need for moretraining of people in informatics expertise?Yes, there is definitely a need for moretraining. This problem has been a majorsubject of discussions both at the NCI andelsewhere. In my view there are differentflavours of bioinformatics. I woulddifferentiate between appliedbioinformatics, which is principally theprovince of those trained in biology, anddevelopmental bioinformatics, which isprincipally the province of those trained instatistics, artificial intelligence andcomputer science. The former are usingthe tools (the databases and methods ofanalyses); the latter are developing them.Currently, large numbers of students withbiological backgrounds want to get intobioinformatics; fewer computer scientistsare available to carry out thedevelopmental work and to guide theinterpretation of the data. So that is aproblem. One of the arguments I’ve madeis that those with major professionalexpertise in developmental informaticshave to be thought of as rock stars. It’s notup to us to decide whether they are worthhigher salaries or higher stipends – eitherwe pay or we don’t play.

Who do you think should be responsible forinformatics training: academia or industry?The majority of companies will do whatthey see as necessary for their ownoperation and, clearly, informatics is amajor focus of their recruiting. At theAmerican Statistical Association JointStatistical Meeting (5–9 August 2001,Atlanta, GA, USA), many companies wereinterviewing hundreds of potentialcandidates with statistical backgrounds.Academia has, historically, been lessorganized and proactive about it, but thesituation is improving.

Do you think that, in the long-term,research will focus more on thedevelopment of diagnostics and prevention,rather than drugs to treat cancer?I hope that we get to a point at whichtreatment of cancer is unnecessary becausewe have sufficiently early diagnosis and theability to prevent cancer. But, at this point,

www.drugdiscoverytoday.com

There is definitely a need formore training in informatics.

What will really be theequivalent of the HumanGenome Project is the completewiring diagram of therelationship of the proteins in amammalian cell.

Page 3: John Weinstein discusses information-intensive approaches to cancer drug discovery

DDT Vol. 6, No. 22 November 2001 updatefeature

1147

the research has to go on at all three levels,because we really don’t know at this pointwhich is going to prevent the mostsuffering and death.

The completion of the Human GenomeProject has been a tremendous milestone inlife sciences research. What do you predictwill be the next similar milestone?That’s an interesting question. Anequivalent of the Human Genome Project’sproduct – but much harder to achieve – isthe complete wiring diagram of proteinstates in a mammalian cell. There areseveral-100,000 such states, it’s going totake a long time. Still, 15 years ago whenthe Human Genome Project was facedwith three-billion base pairs, sequencingthe genome seemed rather a far-off goal.In that case, the goal was well-defined, anda public–private enterprise developed the

accelerated technologies to make ithappen. It will be interesting to see howsoon the same sort of focus, public and/orprivate, on the wiring diagram will makewhat is currently a pipe-dream into reality.

What do you think is the greatestunanswered scientific question?Not cancer. In the 19th century it was theorigin of the species. In the 20th century itwas the molecular nature of life. In the21st century it is the working of thehuman mind. Never in history has it beensuch an exciting privilege to be a scientist.

What has been your greatest achievement?My greatest preoccupation – perhapsachievement – has been to indicate, inprototype, ways in which information onDNA, RNA and protein levels in our cellscan be coupled with information on thepharmacology of potentially therapeuticcompounds in the service of drugdiscovery for cancer.

What is your ultimate career goal?My ultimate career goal is to cure cancer.That is my first goal; my second is to helptrain the generation who will succeed afterI have failed.

National Cancer InstituteNational Institutes of Health

9000 Rockville PikeBethesda

MD 20892, USA

EDITOR’S CHOICE

bmn.com/genomics

As a busy scientist, searching through the wealth of information on BioMedNet can be a bit daunting –

the new gateway to genomics on BioMedNet is designed to help.

The new genomics gateway is updated weekly and features relevant articles selected by the editorial

teams from Drug Discovery Today, Trends in Biotechnology and Current Opinion in Biotechnology.

The regular updates include:

News – our dedicated team of reporters from BioMedNet News provide a busy researcher with all the

news to keep up-to-date on what’s happening – right now.

Journal scan – learn about new reports and events in genomics every day, at a glance, without leafing

through stacks of journals.

Conference reporter – daily updates on the most exciting developments revealed at key conferences in

the life sciences – providing a quick but comprehensive report of what you missed by staying home.

Minireviews and Reviews – a selection of the best review and opinion articles from all Trends and

Current Opinion journals and Drug Discovery Today.

Why not bookmark the gateway at http://bmn.com/genomics for access to all the news, reviews

and informed opinion on the latest scientific advances in genomics.

www.drugdiscoverytoday.com

I hope that we get to a point at which treatment of cancer isunnecessary.

Page 4: John Weinstein discusses information-intensive approaches to cancer drug discovery

update discussion forum DDT Vol. 6, No. 22 November 2001

1359-6446/01/$ – see front matter ©2001 Elsevier Science Ltd. All rights reserved. PII: S1359-6446(01)02048-71148

The commercial use ofstructural genomics ▼▼

There has been a good deal of debaterecently about various structuralgenomics initiatives. It is becoming clearthat the goals of these initiatives varydepending upon the nature of theorganization(s) involved. From anacademic perspective, structuralgenomics is seen as a way of obtainingthe structure of as many novel proteinsas possible to fill out further structural-fold space and enable better predictivemethods. This is a major goal of thepublic initiatives funded by the NationalInstitute of General Medical Sciences(Bethesda, MD, USA). For such publicdomain efforts, it does not really matterwhether the proteins are from eukaryoticor prokaryotic sources and pragmatismis generally used to select the proteins.

From a commercial perspective theimperatives are somewhat different.There is an implicit need to concentrateon discovering the structures of proteinsinvolved in specific pathways, or familiesof proteins believed to be important fordrug discovery (for example, proteinsinvolved in bacterial cell-wallbiosynthesis, or for eukaryotic cells, thevarious protein kinases and so on). Onceobtained, new crystal structures are usedas templates for drug discovery and

design by looking at how compoundsbind to the proteins and using structuraldata to guide modification of thecompounds for optimized potency,selectivity and, in some cases, reducedtoxicity. Several biotechnologycompanies engaged in structuralgenomics are building or accessing themeans to embark upon programs ofstructure-enhanced drug discovery.High-performance protein-modelingsoftware can be used to augmentstructural genomics efforts. Importantly,computational predictions must besubjected to the rigors of experimentaltesting and methods. There is a greatdeal of power in being able to check theproducts of virtual screening by lookingat the crystal structures of complexesbetween the selected compound andthe protein. It is this integration ofcomputational and experimentaltechnologies that will be key in fullydeveloping the value of structuralgenomics for drug discovery.

Access to core technologies thatenable structural genomics initiatives isanything but trivial. In the first place,suitable bioinformatics capabilities haveto be available to enable selection of themost appealing targets and to choosethe most appropriate constructs tomake. Most people are now doingstructural genomics by using X-raycrystallography as the preferred method

for structure determination. Today, NMRis simply not competitive from a high-throughput perspective: 85% of thosestructures in the Protein Data Bank1

(PDB; http://rcsb.org/pdb) come fromX-ray structure determination.

Crystallography involves setting uprapid and automated crystallizationtrials. Most structural genomics effortsare using Multiwavelength AnomalousDiffraction (MAD) phasing forcrystallography, a technique pioneeredby Wayne Hendrickson in whichselenomethionine is incorporated intothe proteins to enable rapid data-interpretation2. Although it is possible touse MAD phasing for data collected atmost synchrotron radiation sources, it isbecoming widely acknowledged that thebest data are being collected from third-generation sources of X-rays, such as theAdvanced Photo Source in Chicago (IL, USA), because of the speed andefficiency with which they are obtained.It is now almost routine to collect dataat the APS and use automatedinterpretation methods to go fromdiffraction pattern to refined novelstructures in less than eight hours.Molecular replacement methods areeven quicker. The commercial structural-genomics efforts will enable theprovision of three-dimensional data onthe way compounds bind to theircognate targets to medicinal chemist inreal time. This will be important forincreasing the efficiency of leaddiscovery and optimization.

References1 Berman, H.M. et al. (2000) The Protein Data

Bank. Nucleic Acids Res. 28, 235–2422 Hendrickson, W.A. (2000) Synchrotron

chromatography. Trends Biochem. Sci. 25,637–643

Tim HarrisStructural GenomiX

11099 N. Torrey Pines RdSuite 160

La JollaCA 92037

USA

The Discussion Forum provides a medium for airing your views on any issues

related to the pharmaceutical industry and obtaining feedback and

discussion on these views from others in the field. You can discuss issues

that get you hot under the collar, practical problems at the bench, recently

published literature, or just something bizarre or humorous that you

wish to share. Publication of letters in this section is subject to editorial

discretion and company-promotional letters will be rejected immediately.

Furthermore, the views provided are those of the authors and are not

intended to represent the views of the companies they work for. Moreover,

these views do not reflect those of Elsevier, Drug Discovery Today or its editorial

team. Please submit all letters to Rebecca Lawrence, News & Features Editor,

Drug Discovery Today, e-mail: [email protected]

www.drugdiscoverytoday.com