Ted Baker School of Biological Sciences University of Auckland New Zealand
New Drug Targets from Mycobacterium tuberculosis: Strategies, Progress and
Pitfalls from a Structural Genomics Enterprise
Ted Baker
School of Biological Sciences
University of Auckland
New Zealand
On behalf of TB Structural Genomics Consortium
The challenge posed by complete genome sequences
The Mycobacterium tuberculosis genome
Approx. 3900 open reading frames (ORFs)
~60% of gene products have an inferred function (mostly by homology)
~25% are “conserved hypotheticals”
~15% are “unknowns”
~30% can be related to proteins of known 3D structure - but only ~25 TB protein structures
Many metabolic pathways appear incomplete
Function from structure?
Relationships that are hidden at the sequence level
SpeB – virulence factor from S. pyogenes
Actinidin – plant cysteine protease
- < 10% sequence identity
Structural Genomics
The use of genomic information to guide protein structure discovery
- and its inverse -
The use of protein structure analysis to add value to genomic sequence data – to deduce function
Reversal of the ‘traditional’ direction of structural analysis
Many targets – whole genomes, pathways, functional classes, folds
Beginnings… ~1998
A pilot programme – Pyrobaculum aerophilum
Using laboratory-scale approaches
- PCR cloning
- Expression in E. coli, cleavable affinity tags
- Variation of expression temperature
- Purification by affinity chromatography and gel filtration
Genomic approach – most tractable first
Results – P. aerophilum
Cloned         25 (274)
Expressed      20 (168)
Soluble        12 (80)
Purified       12 (43)
Crystallized    6 (24)
Structures      4 (11)
Main bottlenecks
- solubility
- crystallization
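The attrition in the results above can be checked with a little arithmetic. The following is a minimal sketch, not part of the talk: it computes the fraction of targets that survive each step, using the first figure in each row. The slide does not say what the parenthesized figures count, so they are carried along only as an unlabeled second series. The names `STAGES`, `step_yields` and `worst_steps` are illustrative.

```python
# Stage counts transcribed from the "Results – P. aerophilum" slide.
# Each tuple is (stage, first figure, parenthesized figure). The meaning of
# the parenthesized figure is not stated on the slide (assumption: it is a
# separate tally), so the bottleneck check below uses only the first figure.
STAGES = [
    ("Cloned", 25, 274),
    ("Expressed", 20, 168),
    ("Soluble", 12, 80),
    ("Purified", 12, 43),
    ("Crystallized", 6, 24),
    ("Structures", 4, 11),
]


def step_yields(counts):
    """Fraction of targets surviving each step, relative to the previous step."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


def worst_steps(stages, n=2):
    """Return the n steps with the lowest yield, as (stage reached, yield)."""
    names = [name for name, _first, _second in stages]
    counts = [first for _name, first, _second in stages]
    transitions = zip(names[1:], step_yields(counts))
    return sorted(transitions, key=lambda item: item[1])[:n]


if __name__ == "__main__":
    first_series = [first for _name, first, _second in STAGES]
    for (name, _f, _s), frac in zip(STAGES[1:], step_yields(first_series)):
        print(f"-> {name}: {frac:.0%}")
    print("Lowest-yield steps:", worst_steps(STAGES))
```

On the first series the two lowest step yields are purified to crystallized (50%) and expressed to soluble (60%), which agrees with the bottlenecks named on the slide.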
Pa_989 (TB homologue)
HisF (imidazoleglycerol phosphate synthase)
Banfield et al. Acta Cryst. D (2001)
Pa_2307 (unknown)
‘Ancient conserved domain’ found in bacteria and archaea. No functional annotation
Reproducible crystals with Li2SO4
- but twinned
Two crystals grown from PEG/phosphate
1.5 Å native data from one, SAD data from Pt(NO2)4 deriv of the other (used gel shift)
Structure solved: SAD/Solve/Resolve/ARP
Pa_2307
The next phase – larger enterprises
Publicly funded
- NIH Protein Structure Initiative (USA)
- Initiatives in Japan, Germany, UK, France, Canada
Biotech companies
- Structural Genomix, Syrrx
NIH Protein Structure Initiative
10 groups (consortia) funded
Aim to develop methods and tools for “high throughput” structure determination
Goals primarily structural
- representative structures for all protein sequence families
- discover novel folds (cover “fold space”)
- estimate 10,000 structures needed
But evolving
Mycobacterium tuberculosis
Causative agent of TB
One-third of world’s population affected - approximately 3 million deaths annually
Five front-line drugs (isoniazid, pyrazinamide, ethambutol, rifampin, streptomycin) but…
- effective only against actively-growing bacteria
- very long treatment regime (6-9 months)
- resistance rising - need for new drugs
Peculiarities of the organism
Very slow-growing Gram-positive organism
Complex waxy cell wall – outer layer rich in unusual lipids, glycolipids, polysaccharides
Novel biosynthetic pathways
Complex lifestyle - persistence
- enters dormant state within active macrophages
- survives through switches in metabolism
- can be reactivated years later
Led in United States by:
- Tom Terwilliger (Los Alamos NL)
- David Eisenberg (UCLA)
- Jim Sacchettini (Texas A&M)
- Bill Jacobs (Albert Einstein Coll. of Med.)
- Tom Alber (UC Berkeley)
….. and many others
Aims are focused on function:
- understanding TB biology
- discovery and structural analysis of novel drug targets
http://www.doe-mbi.ucla.edu/TB/
Philosophy and policies
Open participation - to all with an interest in TB
Operates as a wider consortium of >30 participating labs in 13 countries worldwide
Collaboration between structural biologists, TB biologists, chemists….
Commitment to common policies
- collaboration and cooperation
- shared database for logging progress
- sharing of data and materials
- structures to be placed in public domain
Operational aspects
Central facilities for
- bioinformatic analysis and data storage
- protein expression and evolution
- crystallization
- synchrotron data collection
- gene knockouts
Technologies and facilities available to all
Individuals choose their own targets according to their own interests – and assign priorities
Targeting scores determine priorities of facilities
Parallel efforts in individual labs
Progress to date
Most of structural results to date come as a result of efforts in individual labs
But
- availability of high-throughput facilities gives flexible options for individual labs and for efforts in the facilities
Within facilities – 688 genes cloned (out of 720 targeted to date)
First phase – concentrate on soluble proteins
Next phase – the insoluble proteins
Dealing with insoluble proteins
GFP fusions as reporter of solubility – G. Waldo
Express fusion protein X-L-R
[Figure: insoluble fusion gives non-functional R; soluble fusion allows function of R to be detected]
• Function of R (GFP) depends on solubility of X-L-R.
• Solubility of X-L-R depends on X.
Folding Reporter - GFP
[Figure panel labels: Cell colonies; In vitro transcription + translation; Soluble fraction; Pellet fraction; SDS-PAGE, X (non-fusion); X-L-GFP fusion fluorescence]
Using GFP-fusions to engineer proteins for solubility
[Figure (G. Waldo): Forward evolution, from insoluble protein to soluble protein: mutate gene, clone, select, recombine optima. Backcrossing: clone, select, recombine optima & wild type]
Solubilisation by evolution
Rv2002 – Se Won Suh
Putative ketoacyl ACP reductase
Rendered soluble by 3 random mutations
I6T and T69K mutations are on the molecular surface
V47M mutation enhances a semi-exposed hydrophobic contact
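The mutation labels on this slide use the usual shorthand for point substitutions: one-letter code of the wild-type residue, position in the sequence, one-letter code of the replacement. As a small sketch (the function names `parse_mutation` and `describe` are illustrative, not from the talk), the labels can be unpacked like this:

```python
import re

# One-letter to three-letter codes for the 20 standard amino acids.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Wild-type letter, 1-based residue number, replacement letter, e.g. "V47M".
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'I6T' into (wild_type, position, mutant)."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a point-substitution label: {label!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if wild_type not in AA3 or mutant not in AA3:
        raise ValueError(f"unknown amino-acid code in {label!r}")
    return wild_type, position, mutant


def describe(label):
    """Render a label in three-letter form, e.g. 'I6T' -> 'Ile6 -> Thr'."""
    wild_type, position, mutant = parse_mutation(label)
    return f"{AA3[wild_type]}{position} -> {AA3[mutant]}"


if __name__ == "__main__":
    for label in ("I6T", "T69K", "V47M"):
        print(label, "=", describe(label))
```

Read this way, I6T and T69K each replace a residue with a more polar one (Ile to Thr, Thr to Lys), while V47M swaps one hydrophobic residue for another.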
Potential new TB drug targets
Early results from the TB Structural Genomics Consortium
Target ORF Selection in Mycobacterium tuberculosis
Selection of ORFs: (a) potential drug targets and (b) to understand TB biology
Biosynthetic enzymes for essential amino acids, cofactors, lipids, polysaccharides
Secreted proteins
Proteins implicated in antibiotic resistance or response
Proteins implicated in persistence
1. Cell wall biosynthesis - mycolic acids (Sacchettini lab)
Long chain branched lipids - form dense waxy outer layer of the mycobacterial cell wall
Contribute to its impenetrability
Implicated in both virulence and persistence
Either covalently attached to cell wall or released as trehalose dimycolate (“cord factor”)
Modification of mycolic acids, e.g. cyclopropanation – varies between pathogenic and non-pathogenic species
Cyclopropanation of mycolic acid chains
Cyclopropane groups introduced by methylation
Three cyclopropane synthases (C. Smith, J. Sacchettini – Texas A&M)
CmaA1, CmaA2, PcaA
2. Secreted proteins (Eisenberg lab)
Secreted proteins attractive drug targets for M. tuberculosis because:
Often determinants of virulence or persistence
- involved in cell wall modification
- role in survival in macrophages
M. tuberculosis secretes large number of proteins
Cell wall is impermeable to many anti-bacterial agents
Secreted proteins (C. Goulding, D. Anderson, H. Gill, D. Eisenberg – UCLA)
Rv1886c – Antigen 85B, mycolyl transferase
Rv2220 – Glutamine synthetase
- Synthesis of poly-(L-Glu-L-Gln) for cell wall
Rv1926c – Unknown, resembles cell surface binding proteins (invasin, adaptin, arrestin)
3. Targets against persistence (Sacchettini lab)
Persistence within activated macrophages facilitated by switch in metabolism
Glycolysis downregulated – instead glyoxylate shunt allows use of C2 substrates generated by β-oxidation of fatty acids
Enzymes isocitrate lyase and malate synthase are drug targets for persistent bacteria
Glyoxylate shunt enzymes (V. Sharma, J. Sacchettini - Texas A&M)
Rv0467 – Isocitrate lyase
Rv1837c – Malate synthase
4. Antibiotic resistance - Isoniazid response genes
DNA microarray analysis of TB ORFs upregulated by exposure to isoniazid
Some code for proteins of known function – cell wall biosynthesis
Others represent ‘unknowns’
The proteins encoded by these ORFs may represent the bacterial response to the toxic effects of the antibiotic
Wilson et al., PNAS 96:12833-12838 (1999)
Putative INH response operon
Four ORFs appear to make up part of a putative operon in the TB genome: Rv0340, Rv0341, Rv0342, Rv0343.
None of the four ORFs have detectable sequence homologues in other organisms.
Rv0340 and Rv0341 are paralogues, as are Rv0342 and Rv0343
Same genes also upregulated by ethambutol.
Rv0340 Rv0341 Rv0342 Rv0343
Isoniazid response – Rv0340
Moyra Komen, Vic Arcus, Shaun Lott
Crystallization attempts (figure labels: oil, spherulites)
NMR – shows only partially folded
Limited proteolysis – gives N-terminal fragment with excellent NMR spectrum
NMR spectrum – Rv0340 (residues 1-131)
Indicates helical bundle with flexible tail
Possible homology with acyl carrier protein
Gives putative role in cell wall biosynthesis
Problems of partial or incorrect functional annotation
Rv1347c
Widespread in bacteria, but not eukaryotes
No clearly indicated function
- closest sequence homologs: malonyl CoA decarboxylase, siderophore biosynthesis, aminoglycoside acetyltransferase
No structure prediction
Rv1347c structure - Graeme Card
Rv1347c
Acetyl-CoA dependent aminoglycoside N-acetyl transferase (GCN5 family)
~ 11% sequence identity
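Identity figures like the ~11% quoted here (and the < 10% for SpeB and actinidin earlier) come from a pairwise alignment. A minimal sketch of one way to compute such a number follows. It assumes two already-aligned sequences of equal length with `-` as the gap character, and it divides by the number of columns in which neither sequence has a gap. Other denominators are in use (alignment length, shorter sequence length), so the talk's figures may have been computed differently. The function name and the toy sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues match.

    Both arguments must be rows of the same pairwise alignment, so they
    must have equal length. Columns with a gap in either row are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gap column: excluded from the denominator
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment (not real protein sequences): 4 matches in 5 gap-free columns.
    print(percent_identity("ACDE-F", "ACGEHF"))
```

At identity levels this low, sequence comparison alone is usually not enough to establish a relationship, which is why the slide relies on the structural similarity instead.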
Problem of partial or incorrect functional annotations
Rv3853 - “menG”
Putative SAM-dependent methyltransferase catalysing final step in menaquinone biosynthesis
Potential drug target – menaquinone pathway is essential and is not present in humans
Genome also includes ubiE (Rv0558) - catalyses this step in both menaquinone and ubiquinone biosynthesis (menG is specific for menaquinone)
Expressed, refolded, crystallized, solved to 1.9 Å by SIRAS
MenG structure – Jodie Johnston
Common methyltransferase fold
Structure does not look like a methyltransferase
Resembles a phosphate transfer domain?
Incorrect annotation
Challenges for the future
Membrane proteins
Solubility of expressed proteins
Hetero-oligomeric proteins
Protein-protein interactions
Assignment of function to “unknowns”
Cellular pathways
- metabolic pathways
- signalling pathways
Conclusions
Structural biology is being transformed by new technologies – some driven by genomics
Less effort in solving initial structures – more emphasis on “downstream” studies
TB structural genomics consortium – a different model for large scale structure determination
- access to centralised facilities
- international effort on a common goal
- collaboration rather than competition
- opportunities for smaller labs
Thanks
Mycobacterium tuberculosis structural genomics consortium
Members of Auckland Structural Biology Laboratory – Vic Arcus, Kristina Backbro, Mark Banfield, Heather Baker, Graeme Card, Jodie Johnston, Rainer Knijff, Moyra Komen, Shaun Lott, Andrew McCarthy, Clyde Smith
Marsden Fund
Health Research Council
New Economy Research Fund