Proteomic analysis of the insoluble subproteome of Clostridium difficile strain 630


RESEARCH LETTER

Shailesh Jain, Robert L.J. Graham, Geoff McMullan & Nigel G. Ternan

School of Biomedical Sciences, University of Ulster, Coleraine, Co. Londonderry, Northern Ireland

Correspondence: Nigel G. Ternan, School of

Biomedical Sciences, University of Ulster,

Cromore Road, Coleraine, Co. Londonderry

BT52 1SA, Northern Ireland. Tel.: +44 28

7032 3063; fax: +44 28 7032 4965; e-mail:

[email protected]

Present address: Robert L.J. Graham, The

Proteome Exploration Laboratory, California

Institute of Technology, Beckman Institute,

Pasadena, CA 91125, USA.

Received 2 July 2010; revised 20 August 2010;

accepted 27 August 2010.

Final version published online 24 September

2010.

DOI:10.1111/j.1574-6968.2010.02111.x

Editor: André Klier

Keywords

multidimensional; proteomics; GeLC/MS;

membrane associated; leucine.

Abstract

Clostridium difficile, a Gram-positive spore-forming anaerobe, causes infections in

humans ranging from mild diarrhoea to potentially life-threatening pseudomembranous

colitis. The availability of genomic information for a range of C. difficile

strains affords researchers the opportunity to better understand not only the

evolution of these organisms but also their basic physiology and biochemistry. We

used proteomics to characterize the insoluble subproteome of C. difficile strain

630. Gel-based LC-MS analysis led to the identification of 2298 peptides; PROVALT

analysis with a false discovery rate set at 1% reduced this list to 560 unique

peptides, resulting in 107 proteins being positively identified. These were functionally

classified and physicochemically characterized, and pathway reconstruction

identified a variety of central anaerobic metabolic pathways, including glycolysis,

mixed acid fermentation and short-chain fatty acid metabolism. Additionally, the

metabolism of a variety of amino acids was apparent, including the reductive

branch of the leucine fermentation pathway, from which we identified seven of the

eight enzymes. Increasing proteomics data sets should – in conjunction with other

‘omic’ technologies – allow the construction of models for ‘normal’ metabolism in

C. difficile 630. This would be a significant initial step towards a full systems

understanding of this clinically important microorganism.

Introduction

The Gram-positive spore-forming anaerobe Clostridium

difficile, first described by Hall & O’Toole (1935), has

become recognized as the leading cause of infectious diarrhoea

in hospital patients worldwide over the last three

decades (Riley, 1998; Sebaihia et al., 2007). Two factors are

significant in the increased prevalence of C. difficile infection

(CDI): the increase in the use of broad-spectrum antibiotics,

including cephalosporins and aminopenicillins (Poutanen &

Simor, 2004), and the widely reported contamination of the

hospital environment by C. difficile spores (Durai, 2007).

Antibiotic-associated diarrhoea and colitis were well established

soon after antibiotics became available, with C. difficile

being identified as the major cause of antibiotic-associated

diarrhoea and as the nearly exclusive cause of potentially life-threatening

pseudomembranous colitis in 1978 (Bartlett,

2006). Clostridium difficile’s well-documented antibiotic resis-

tance results in its persistence when the normal gut microbial

communities are disturbed or eradicated by antibiotic therapy,

following which C. difficile spores germinate, producing

vegetative cells, which, upon proliferation, secrete the organ-

ism’s two major virulence factors – toxin A and toxin B. As the

major virulence factors, the toxins have been studied exten-

sively in order to dissect C. difficile virulence mechanisms and

they are the primary markers for the diagnosis of CDI

(reviewed extensively elsewhere – e.g. Voth & Ballard, 2005;

Jank et al., 2007; Lyras et al., 2009). The toxins lead to the

development of symptoms associated with CDI, ranging from

mild, self-limiting watery diarrhoea, to mucosal inflammation,

high fever and pseudomembranous colitis (Bartlett &

Gerding, 2008).

Recently, a new epidemic of C. difficile, associated with

the emergence of a single hypervirulent strain of C. difficile

characterized as toxinotype III, North American pulsed-

field gel electrophoresis type 1 (NAP1), restriction-endonu-

clease analysis group type BI and PCR-ribotype 027 (Pepin

et al., 2005; Green et al., 2007; Marcos & DuPont, 2007), has

FEMS Microbiol Lett 312 (2010) 151–159 © 2010 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved

come to light. The strain carries the binary toxin gene cdtB,

and has an 18-base-pair deletion in the toxin repressor gene,

tcdC, which means that it generates approximately 16–23

times more toxin than other strains (Warny et al., 2005).

Infection is associated with a high risk of acute clinical

deterioration and a poor response to metronidazole therapy

(Spigaglia & Mastrantonio, 2002; Pepin et al., 2005), making it

a major concern for healthcare worldwide. Clostridium difficile

ribotype 027 was initially rare in the United Kingdom;

however, when outbreaks at Stoke Mandeville and the Royal

Devon and Exeter Hospitals were investigated in 2004–2005,

type 027 was found to predominate in their cases (Anon,

2006), and this ribotype has now been detected in the majority

of countries around the world (Kuijper et al., 2007).

It is clear, then, that C. difficile is a significant burden on

the healthcare profession and patients. With the ever-

increasing availability of genomic information, however,

greater insight into the evolution and variation of C. difficile

genomes is now possible (Stabler et al., 2006, 2009; He et al.,

2010). The Clostridb database (http://xbase.bham.ac.uk/

clostridb/) (Chaudhuri & Pallen, 2006), an excellent publicly

accessible resource for those interested in comparative

genomics of the genus Clostridium, currently contains

genome sequences of 18 strains of clostridia, including two

genomes of C. difficile, namely C. difficile 630 and C. difficile

qcd32_g58, a representative of the predominant NAP1/BI/

027 strain in Quebec (Loo et al., 2005). The 4.29 Mb genome

of C. difficile strain 630 and its 7.8 kb plasmid encode

a remarkable number of genes associated with resistance

to antimicrobial agents, as well as virulence factors, host

adherence and surface structures (Sebaihia et al., 2007).

Genome sequences have been generated recently for a

further six strains, including CD196, an early, nonepidemic,

ribotype 027 strain (Stabler et al., 2009), the R20291 isolate

responsible for the UK Stoke Mandeville outbreak, and 21

other hypervirulent ribotype 027 strains isolated over the

past two decades (He et al., 2010). A further six hyperviru-

lent isolates associated with the Quebec outbreak and a

reference ATCC43255 strain are at the draft genome se-

quence stage (McGill University and Genome Quebec

Innovation Centre), while the human microbiome project

at Baylor College of Medicine has draft genome sequences

for two strains (NAP07, NAP08) at the time of writing.

These genomic data, along with recently developed tools for

Clostridial functional genomics (Heap et al., 2009), make it

possible for researchers to adopt a systems approach to the

dissection of the physiology and biochemistry of this patho-

gen. To meet the challenges of systems biology, there must be a

comprehensive analysis of individual organisms, linking data

from various genome-wide approaches with that generated

from proteomic investigations (Romijn et al., 2003). Our

laboratory, among others (Graham et al., 2006a, b, 2007; Beck

et al., 2009), has adopted an approach in which fractionation

of whole bacterial cell proteomes into subproteomes reduces

sample complexity and increases the robustness of protein

identifications as the proteome of even a subcellular fraction

remains too complex for complete analysis by one dimension

of LC-MS (Fang et al., 2010). We have previously character-

ized the insoluble proteomes of the Gram-positive bacteria

Geobacillus thermoleovorans T80 and Oceanobacillus iheyensis

HTE831 (Graham et al., 2006b, 2007). These studies have

affirmed, postgenomically, the expression within these organ-

isms of the protein machinery that allows cells to interact with

their environment, with functions including cell–cell signal-

ling, adhesion and stress response, and have shown that

bacteria can express stress-related proteins even under ‘opti-

mal’ laboratory conditions (O’Toole et al., 2010). A number of

stress-related proteins, including molecular chaperones, play a

role in virulence and adhesion in certain pathogens, including,

for example, Helicobacter pylori and Salmonella enterica (Hen-

derson et al., 2006).

The proteomic characterization of bacterial-insoluble

subproteomes has been previously proven to be an effective

strategy in the generation of important physiological and

biochemical information. Therefore, we wished to identify

and characterize this fraction of the C. difficile strain 630

proteome. This approach will provide an insight into the

metabolic processes of actively growing C. difficile cells and

furthermore will complement existing proteomic data sets

from spore and cell-wall subfractions from this organism.

Materials and methods

Microorganism and culture conditions

Clostridium difficile strain 630 was a kind gift from Dr Peter

Mullany of the Eastman Dental Institute (London, UK) and

was routinely maintained on brain–heart infusion (BHI)

agar (Oxoid) in a MACS MG500 Anaerobic workstation

(Don Whitley Scientific, UK) in an 80 : 10 : 10 atmosphere of

N₂ : H₂ : CO₂, at 37 °C. Liquid culture (1 L in glass bottles) was

performed in BHI broth (Oxoid) with resazurin (1 mg L⁻¹)

added as an anaerobic indicator. Overnight cultures in BHI

broth were inoculated with a single colony and used as inocula

at 5% (v/v). Culture growth was followed as attenuance (D) at

650 nm vs. uninoculated BHI broth.

Cell harvest and protein extraction

Mid log-phase cells (D650 nm = 0.5) were harvested from

duplicate 1 L cultures by transferring to two 500 mL centrifuge

bottles in the anaerobic cabinet. Bottle lids were screwed down

tightly and cells were harvested (9000 g, 10 min, 3–5 °C, Beckman

J2-HS centrifuge/JA10 rotor). The supernatant was

removed inside the anaerobic cabinet and ice-cold 10 mM

phosphate-buffered saline (PBS) (pH 7.8) was added to

resuspend the cells; a second centrifugation washed the cells.

Cells were resuspended in PBS at a ratio of 1 g cells to 2 mL

buffer inside the anaerobic cabinet.

Cell suspension (1 mL) was added to Lysing Matrix E tubes

(MP Biomedicals, UK) inside the cabinet and the cells were

lysed mechanically by treatment in a Fastprep150 instrument

for 2 min (4 × 30 s treatment, 2-min cooling on ice). Homogenates

were first centrifuged at 25 000 g to remove unbroken

cells and debris and the resultant supernatants were subsequently

ultracentrifuged (150 000 g, 2 h, 3–5 °C, Beckman

L8-M centrifuge/70.1 Ti rotor) to pellet the insoluble proteins,

following which the supernatant was removed.

The insoluble pellet was resolubilized by gentle sonication

in resolubilization buffer (1 mL) as described previously

(Graham et al., 2006b), and the protein concentration was

measured using the Bradford (1976) assay. Samples were

reduced and alkylated before electrophoresis and protein

(42 µg) from each duplicate was electrophoresed and stained

(Graham et al., 2006b). Lanes were excised from the gel and

cut into seven fractions based on molecular mass and an in-

gel tryptic digest was carried out as described previously

(Graham et al., 2006b).
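The in-gel digest relies on the cleavage specificity of trypsin, and the database search described in the next section allows for one missed cleavage. As a minimal sketch of what this means for the candidate peptides a search engine considers, the Python below digests a sequence in silico. It assumes the commonly used rule that trypsin cleaves C-terminal to K or R except when the next residue is P; the function names and the example sequence are illustrative only and are not taken from MASCOT or PROVALT.

```python
def cleavage_sites(sequence):
    """Return indices after which trypsin is assumed to cut (after K/R, not before P)."""
    sites = []
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    return sites


def tryptic_peptides(sequence, missed_cleavages=1):
    """List peptides from an in silico digest, allowing up to `missed_cleavages`."""
    bounds = [0] + cleavage_sites(sequence) + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        # j - i - 1 is the number of cleavage sites skipped inside the peptide
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


# Hypothetical toy sequence: the R at position 6 is followed by P, so it is not cut.
example = "AAKGGRPCCKDD"
fully_cleaved = tryptic_peptides(example, missed_cleavages=0)
# -> ['AAK', 'GGRPCCK', 'DD']
one_missed = tryptic_peptides(example, missed_cleavages=1)
# -> ['AAK', 'AAKGGRPCCK', 'GGRPCCK', 'GGRPCCKDD', 'DD']
```

Allowing one missed cleavage roughly doubles the candidate list, which is one reason search settings of this kind are usually kept conservative.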

LC-MS and database searching

LC-MS of peptide samples was performed as described by

Graham et al. (2006a, b) using a 60-min nano-LC gradient.

Protein identification was carried out using an internal MASCOT

SERVER (version 1.9; Matrix Science, London, UK) searching

against a combined C. difficile genomic DNA and plasmid

database (Reference sequence NC_009089 and NC_008226,

respectively) downloaded from NCBI (20 June 2007) and

containing 3573 sequences in total. Peptide tolerance was set

at 1.2 Da with an MS/MS tolerance of 0.6 Da and the search

set to allow for one missed tryptic cleavage. To expedite

the curation of the identified protein list from MASCOT,

the resultant MASCOT output files were reanalysed against the

extracted C. difficile database using PROVALT (Weatherly et al.,

2005), which takes multiple MASCOT results and identifies

matching peptides. Redundant peptides are removed and

related peptides are grouped together, associated with their

predicted matching protein. PROVALT also uses peptide matches

from a random database (in this case, the C. difficile database

was randomized) to calculate the false-discovery rates (FDR)

for protein identifications as described previously by Weath-

erly et al. (2005). In the current work, FDR was set at 1%; thus,

99% of the proteins identified should be correct.
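The idea behind a decoy-based FDR can be illustrated with a short sketch. The Python below is a generic target-decoy estimate, not PROVALT's actual algorithm (which groups peptides into proteins before estimating the FDR as described by Weatherly et al., 2005). It assumes each match carries a single numerical score; the function names and the toy scores are hypothetical.

```python
def fdr_at_threshold(target_scores, decoy_scores, threshold):
    """Estimate FDR as (decoy matches) / (target matches) at or above a score threshold."""
    targets = sum(score >= threshold for score in target_scores)
    decoys = sum(score >= threshold for score in decoy_scores)
    return decoys / targets if targets else 0.0


def lowest_threshold_for_fdr(target_scores, decoy_scores, max_fdr=0.01):
    """Return the lowest observed target score whose estimated FDR is <= max_fdr."""
    for threshold in sorted(set(target_scores)):
        if fdr_at_threshold(target_scores, decoy_scores, threshold) <= max_fdr:
            return threshold
    return None  # no threshold satisfies the requested FDR


# Toy data: 200 matches against the real database, 3 against the randomized one.
target_scores = list(range(1, 201))
decoy_scores = [5, 15, 40]

cutoff = lowest_threshold_for_fdr(target_scores, decoy_scores, max_fdr=0.01)
# -> 16: above this score only one decoy match (40) remains among 185 target matches
```

With the cutoff applied, the expected proportion of incorrect identifications in the retained list is at most the chosen FDR, which is the sense in which a 1% setting implies that about 99% of identifications should be correct.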

Results and discussion

Characteristics of the C. difficile insoluble subproteome

The workflow used in our gel-based analysis firstly isolated

the insoluble fraction of the proteome from duplicate

C. difficile cultures by ultracentrifugation, yielding a protein

concentration of 22.4 mg mL⁻¹. Because of the complex

nature of the peptide mixtures being analysed and the

chance nature of automated selection of peptides for MS

analysis (Graham et al., 2006a, b), the separation capabilities

of the LC-MS system can often be exceeded. However, using

1D SDS-PAGE as a prefractionation step yields high-quality

and reproducible separation of hydrophobic protein mix-

tures (Supporting Information, Table S1) and concomi-

tantly reduces sample complexity before the MS analysis

(Cottingham, 2010), further aiding proteome coverage. For

the C. difficile peptide fractions analysed in this investigation

(Fig. 1), the number of unique proteins identified in a sample

did not increase significantly after three replicate injections

(Fig. S1) and therefore all peptide samples were injected and

analysed three separate times to maximize the overall protein

identification. Stringent automated curation of the data set

using PROVALT set with a FDR of 1% yielded a total of 560

uniquely identified peptides, corresponding to 107 uniquely

identified proteins. The average MOWSE score was 240; the

average number of peptides per protein was five and the

average protein coverage was 24% (Tables S1 and S2).
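The summary figures above can be related to each other with simple arithmetic: 560 unique peptides over 107 proteins gives roughly five peptides per protein. The sketch below, in Python, shows that calculation and one straightforward way to compute sequence coverage for a single protein, assuming coverage is defined as the percentage of residues that fall within at least one identified peptide. The function name and the toy sequence are hypothetical, and the search software may compute coverage differently.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues in `protein` covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            # look for further (possibly overlapping) occurrences
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Values reported in the text
peptides_per_protein = 560 / 107  # about 5.23, reported as "five"

# Toy example: two overlapping peptides cover 5 of 10 residues
toy_coverage = sequence_coverage("ABCDEFGHIJ", ["ABC", "CDE"])
# -> 50.0
```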

The proteins identified had widely varying physicochemical

characteristics, with the most acidic protein being a

conserved hypothetical protein (CD2522; pI 4.57) and the

most basic being 50S ribosomal protein L20 (pI 11.48). The

lowest molecular mass protein identified was 50S ribosomal

protein L36 (Mr 4277 Da) and the highest was a hypothetical

protein (CD0590; Mr 197 241 Da). We could functionally

categorize all except for three of the proteins identified in

this study according to the SubtiList functional category list

(Graham et al., 2006a, b, 2007) (Table 1). The largest

category of identified proteins was that involved in protein

synthesis (45.8%), followed by that involved in the metabolism

of amino acids and related molecules (10.3%).

Fig. 1. Fractionation of the Clostridium difficile strain 630 insoluble subproteome by 1D SDS-PAGE: correlation of molecular mass of identified proteins with a gel slice. [Plot: molecular mass (Da) against gel fraction (1–7), alongside the gel image with Mw (kDa) markers.]

Of the three ‘uncategorizable’ proteins identified, those encoded by

CD2552 (iojap-like protein) and CD1711 may be part of the

bacterial core genome, a concept proposed by Mulkidjanian

et al. (2006) and further developed in the recent work of

Callister et al. (2008). Homologues of these proteins are also

found in other species of saccharolytic and fermentative

clostridia, in addition to other known gut bacteria including

Roseburia intestinalis and Faecalibacterium prausnitzii (Ami-

nov et al., 2006). The third, CD0590, encodes a conserved

hypothetical protein that has an N-terminal Mg²⁺/GTP-binding

motif as identified by BLASTP analysis. Interestingly,

and in contrast to the other two hypothetical proteins

identified in this study, CD0590 appears to be absent from

all other Clostridia species and indeed yields no significant

homology matches with any other organism in the NCBI

database. The exception to this appears to be a protein

encoded by the adjacent gene, CD0589, which shares

significant homology and appears to represent a duplication

of the N-terminal Mg²⁺/GTP-binding region of CD0590.

All publicly available C. difficile genomes also appear to

contain homologues of both CD0590 and CD0589. As

regards a possible function for protein CD0590, O’Connor

et al. (2006) reported that CD0589 and CD0590 belonged to

an operon consisting of four ORFs, CD0587–CD0590, that

were positively regulated by rgaR, a C. difficile protein

similar to the VirR toxin gene regulator of C. perfringens.

Comparative phylogenomic analysis of C. difficile strains, by

Stabler et al. (2009), showed that the deletion of five specific

genes, including CD0590, was characteristic of a toxin A−/B+

subclade of C. difficile strains; therefore, it may be

hypothesized that the protein encoded by CD0590 is in

some way important for toxin A production by C. difficile.

However, under the conditions of our study, neither toxin A

nor toxin B was detected.

In a previous study of cell-surface proteins (as distinct from

the insoluble proteins reported here) from C. difficile, Wright

et al. (2005) identified a total of 11 proteins from a glycine

extract of whole cells and a further 42 proteins from a lysozyme

digest of their peptidoglycan layer, resulting in a total of 47

uniquely identified proteins. It is to be expected that different

experimental approaches, including sample types and extrac-

tion methods, will lead to the identification of different

proteomic data for the same organism. For example, the

hypothetical proteins identified by us were distinct from those

detected by Lawley et al. (2009) in the C. difficile spore

proteome. When we compared data from our current investi-

gation with the previous work of Wright et al. (2005), 20

proteins were common to both studies, 27 were unique to

Wright and colleagues and 87 were unique to our work. The

larger number of proteins identified by our bottom-up geLC-

MS approach confirms that this experimental strategy can yield

significant and important biological information to further our

understanding of a microorganism.
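The overlap figures quoted above are internally consistent, as a quick set calculation shows. The Python sketch below uses placeholder identifiers (the real protein accessions are not listed in this section) purely to reproduce the counts: 20 shared proteins, 87 unique to this study and 27 unique to Wright et al. (2005).

```python
def compare_identifications(study_a, study_b):
    """Summarize the overlap between two lists of protein identifiers."""
    a, b = set(study_a), set(study_b)
    return {
        "shared": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "union": len(a | b),
    }


# Placeholder identifiers standing in for real accessions (hypothetical names)
shared_ids = [f"shared_{i}" for i in range(20)]
this_study = shared_ids + [f"this_{i}" for i in range(87)]
wright_2005 = shared_ids + [f"wright_{i}" for i in range(27)]

summary = compare_identifications(this_study, wright_2005)
# -> {'shared': 20, 'only_a': 87, 'only_b': 27, 'union': 134}
```

The totals recovered from the sets (107 and 47 proteins) match the numbers reported for the two studies, and together the studies cover 134 distinct proteins.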

An important step towards understanding the function of

a protein is the determination of its subcellular localization,

and in recent years, a number of bioinformatic tools have

been developed to assist with this (Emanuelsson et al.,

2007). Knowledge of Gram-positive bacterial protein target-

ing/secretion is essentially restricted to the model organism

Bacillus subtilis (Tjalsma et al., 2000, 2004), and indeed,

Desvaux et al. (2005) state that protein secretion by clostridia

in general is ‘poorly understood’. As the insoluble

proteome might be expected to contain proteins associated

with, or targeted to, either the cell membrane or the

extracellular milieu, and that could thus play a role in

virulence, we used PSORTB (Gardy et al., 2005),

SIGNALP (Bendtsen et al., 2004) and SECRETOMEP (Bendtsen

et al., 2005) to guide our efforts to assign a subcellular

location for each protein.

Table 1. Functional categorization of proteins identified within the insoluble subproteome of Clostridium difficile strain 630.

Functional category                                             No. of proteins   Protein distribution (%)
Cell wall                                                       3                 2.8
Transport/binding proteins and lipoproteins                     5                 4.7
Sensors (signal transduction)                                   0                 0
Membrane bioenergetics (electron transport and ATP synthase)    2                 1.9
Mobility and chemotaxis                                         2                 1.9
Protein secretion                                               0                 0
Cell division                                                   1                 0.9
Sporulation                                                     0                 0
Germination                                                     0                 0
Transformation/competence                                       0                 0
Specific pathways                                               7                 6.5
Main glycolytic pathway                                         5                 4.7
TCA cycle                                                       0                 0
Metabolism of amino acids and related molecules                 11                10.3
Metabolism of nucleotides and nucleic acids                     1                 0.9
Metabolism of lipids                                            2                 1.9
Metabolism of coenzymes and prosthetic groups                   1                 0.9
Metabolism of phosphate                                         0                 0
Metabolism of sulphur                                           0                 0
DNA replication                                                 0                 0
DNA restriction/modification and repair                         0                 0
DNA recombination                                               0                 0
DNA packaging and segregation                                   0                 0
RNA synthesis                                                   0                 0
RNA modification                                                2                 1.9
Protein synthesis, ribosomal proteins                           49                45.8
Protein synthesis, aminoacyl-tRNA synthetases                   2                 1.9
Protein synthesis, initiation                                   2                 1.9
Protein synthesis, elongation                                   2                 1.9
Protein synthesis, termination                                  0                 0
Protein modification                                            0                 0
Protein folding                                                 4                 3.7
Adaptation to atypical conditions                               1                 0.9
Detoxification                                                  1                 0.9
Antibiotic production                                           0                 0
Phage-related functions                                         0                 0
Transposon and IS                                               1                 0.9
Miscellaneous                                                   0                 0
Similar to unknown proteins                                     3                 2.8
No similarity                                                   0                 0
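The percentages in Table 1 follow directly from the protein counts. As a quick consistency check, the Python sketch below recomputes the distribution from the non-zero counts given in the table; the dictionary and function names are illustrative.

```python
# Non-zero protein counts per functional category, as listed in Table 1
TABLE1_COUNTS = {
    "Cell wall": 3,
    "Transport/binding proteins and lipoproteins": 5,
    "Membrane bioenergetics": 2,
    "Mobility and chemotaxis": 2,
    "Cell division": 1,
    "Specific pathways": 7,
    "Main glycolytic pathway": 5,
    "Metabolism of amino acids and related molecules": 11,
    "Metabolism of nucleotides and nucleic acids": 1,
    "Metabolism of lipids": 2,
    "Metabolism of coenzymes and prosthetic groups": 1,
    "RNA modification": 2,
    "Protein synthesis, ribosomal proteins": 49,
    "Protein synthesis, aminoacyl-tRNA synthetases": 2,
    "Protein synthesis, initiation": 2,
    "Protein synthesis, elongation": 2,
    "Protein folding": 4,
    "Adaptation to atypical conditions": 1,
    "Detoxification": 1,
    "Transposon and IS": 1,
    "Similar to unknown proteins": 3,
}


def distribution(counts):
    """Return each category's share of the total, in percent, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_proteins = sum(TABLE1_COUNTS.values())  # 107
shares = distribution(TABLE1_COUNTS)
# shares["Protein synthesis, ribosomal proteins"] -> 45.8
# shares["Metabolism of amino acids and related molecules"] -> 10.3
```

The counts sum to the 107 proteins identified, and the recomputed shares agree with the percentages printed in the table.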

All 107 proteins identified in this study were analysed and

assigned a putative or a predicted cellular localization as

shown in the workflow depicted in Fig. 2. Within the subset

of proteins predicted to be secreted, 23 were identified as

possessing an N-terminal signal peptide (Table 2). Three cell

wall-associated proteins were identified including two SlpA

variants and a recently characterized cysteine protease,

Cwp84, which Kirby et al. (2009) have shown is required

for maturation of the S-layer, but that is not essential for

virulence. Of the two proteins classified as ABC transporters,

neither conformed to the expected architecture for such a

protein, namely, a leader peptide containing an N- and C-

domain completely lacking an intervening hydrophobic

domain, in addition to a double-glycine motif N-terminal

of the signal peptide cleavage site. All the other ‘transport’

proteins identified contained a significant hydrophobic

domain between the N- and the C-domain of the predicted

signal peptide, in addition to a number of other motifs

usually associated with the twin arginine translocation or

Sec secretion pathways. None of the 23 proteins contained

any C-terminus cell wall anchor motifs commonly found in

Gram-positive bacteria, such as LPxTG, NPQTN or TLxTC

(Dramsi et al., 2005; Desvaux et al., 2006).

Fig. 2. Bioinformatics workflow for the prediction of Clostridium difficile strain 630 protein subcellular location. [Flowchart: the 107 proteins of the insoluble proteome were split by PSortB into 58 proteins predicted to be cytoplasmic with no helical domains and 49 proteins with all other predicted localizations; both groups were then analysed with SignalP and SecretomeP. Remaining box labels: predicted nonsecretory, 55 proteins; predicted secretory, 3 proteins; predicted noncytoplasmic with no signal peptide, 6 proteins; predicted signal peptide, 18 proteins; predicted cytoplasmic, 28 proteins; predicted both signal peptide and nonclassically secreted, 13 proteins; predicted nonclassically secreted, 6 proteins.]
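A scan for the C-terminal anchor motifs named above can be sketched with regular expressions. The Python below is an illustration only, not the procedure used in this study: it treats x as any residue, and the 50-residue C-terminal window is an arbitrary assumption introduced for the example. The function name and the test sequences are hypothetical.

```python
import re

# Motifs named in the text; "x" is taken to mean any single residue
ANCHOR_MOTIFS = {
    "LPxTG": re.compile(r"LP.TG"),
    "NPQTN": re.compile(r"NPQTN"),
    "TLxTC": re.compile(r"TL.TC"),
}


def find_anchor_motifs(sequence, c_terminal_window=50):
    """Return (motif name, 0-based position) pairs found near the C-terminus.

    `c_terminal_window` (assumed value, must be >= 1) limits the search to the
    last N residues of the sequence.
    """
    tail = sequence[-c_terminal_window:]
    offset = len(sequence) - len(tail)
    hits = []
    for name, pattern in ANCHOR_MOTIFS.items():
        for match in pattern.finditer(tail):
            hits.append((name, offset + match.start()))
    return sorted(hits, key=lambda hit: hit[1])


# Synthetic sequence with an LPETG motif 25 residues from the C-terminus
with_anchor = "M" + "A" * 60 + "LPETG" + "A" * 20
anchored = find_anchor_motifs(with_anchor)
# -> [('LPxTG', 61)]

# The same motif near the N-terminus falls outside the window and is ignored
n_terminal_only = find_anchor_motifs("MKKLLALPETG" + "A" * 100)
# -> []
```

Restricting the search to the C-terminal region reflects where sortase-type anchor signals are expected; a whole-sequence scan would be more permissive and more prone to chance matches.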

As in our previous work, we used the pathway recon-

struction tool BIOCYC (Karp et al., 2005) to analyse pathways

inferred from our proteomics dataset. The snapshot of

C. difficile metabolism presented here reflects the nutritional

complexity of BHI broth, which contains glucose, proteose

peptone and bovine BHI solids. We could, therefore, recon-

struct a number of key central metabolic pathways (Djord-

jevic et al., 2003) that would be expected to be active in

clostridial cells including glycolysis, mixed acid fermenta-

tion and fermentation of amino acids (Gottschalk, 1979)

(see Figs. S1-S3). The metabolic processes we have identified

in C. difficile are, therefore, broadly similar to those

described in a recent proteomic investigation of the Gram-

negative gut anaerobe, Fusobacterium varium. Potrykus et al.

(2008) report that F. varium may play both beneficial and

pathogenic roles in the human gut. While the antics of C.

difficile left unchecked have given it a deservedly bad reputation

(Heap et al., 2009), its ability to produce butyrate (Fig. S3), as

is known to occur in F. varium, could mean that in asympto-

matic carriers of C. difficile, the organism has the potential to

contribute to colonocyte health. Such a counterintuitive hy-

pothesis highlights the need, not only from a basic science

perspective but also from a position of concern for public

health, to know the frequency of asymptomatic

C. difficile carriers within the general population: therefore, we

see an urgent requirement to develop a better understanding of

C. difficile biology within the human microbiome.

The pathogenicity of C. difficile is dependent on a combination of toxin synthesis, p-cresol production and a diverse range of amino acid fermentations (Kim et al., 2008). Leucine is reported to be indispensable for the growth of this organism and may be metabolized by a reductive pathway to isocaproate, or by means of an alternative oxidative pathway in which isovalerate and ammonia are produced. Thus, unlike the typical Stickland reaction, here, leucine may serve both as an oxidant and as a reductant (Kim et al., 2006). In the present study, we identified seven of the eight proteins necessary for the reductive branch of the leucine fermentation pathway (Fig. 3), with the sole exception of the ATP-dependent activator protein, HadI (Kim et al., 2005). While leucine fermentation is of fundamental importance to C. difficile growth and pathogenesis, the pathway is also of significant scientific interest as it involves a novel mechanism to generate the necessary radicals for the dehydration of 2-hydroxyisocaproyl-CoA to 2-isocaprenoyl-CoA, which does not depend on the typical radical generators such as oxygen, coenzyme B12 or S-adenosyl methionine (Kim et al., 2008). Clostridia are hypothesized to have emerged some 2.34 billion years ago and C. difficile between 1.1 and 85 million years ago (He et al., 2010), thus supporting the hypothesis put forward by Kim et al. (2008) that these reactions, which proceed via a novel allylic ketyl radical intermediate, represent an evolutionarily ancient means for radical formation in bacteria. Given the organismal and scientific importance of this pathway and our success in the identification of the majority of its proteins, it should be possible, in conjunction with other ‘omic’ technologies, to develop a model for leucine metabolism within C. difficile. This would represent one step towards the development of a systems understanding of this microorganism.

Table 2. Proteins identified within the insoluble subproteome of Clostridium difficile strain 630 with predicted export signals*

Protein            Function                                                                         Signal peptide

Cell wall
CD2793 (T, S)      Cell-surface protein (S-layer precursor protein) slpA                            MN|KKNI|AIAMSGLTVLASAAPVFA
CD2791 (T, S)      Cell-surface protein (putative S-layer protein precursor)                        MN|KKNL|SVIMAAAMISTSVAPVFA
CD2787 (T, S)      Cell-surface protein (putative cell surface-associated cysteine protease) cwp84  MRKYKSKKLSKLLALLTVCFLIVSTIPVSA

ABC transporters
CD0873 (L, T, S)   ABC transporter, substrate-binding lipoprotein                                   MIN�KKRL�ASLILAGALSISMLTGCSQG
CD2672 (L, T, S)   Oligopeptide ABC transporter, substrate-binding protein appA                     MKF�KKLA�SLILVSSLMLTFTACA

Other transporters
CD2667 (T, S)      PTS system, glucose-specific IIbc component EC 2.7.1.69 ptsG                     M�KKVF�GVLQKVGKSLMLPVALLPAAGILLGVSNALA
CD3014 (T, S)      PTS system, IIb component frwB                                                   M�KRKI�IAVTACATGVAHTYMAAQA

Mobility and chemotaxis
CD3513 (T)         Pilin precursor (comGD site # )                                                  MKLKKN�KKG # F�TLVELLVVIAIIGILAVVAVPALF

Amino acid metabolism
CD0107 (T)         Aspartate aminotransferase EC 2.6.1.1 aspC                                       MLS�KRLN�FITPSYTIGISSKVKEM

Ribosomal proteins
gi|115249080 (T)   50S ribosomal protein L2 rplB                                                    MAI�KKFRP�TSPALRQMTVLVSD
gi|115249078       50S ribosomal protein L4 rplD†                                                   MTNLEKGGITMPKLNVLNVSGQNVGEIELS
gi|115249703 (T)   50S ribosomal protein L20 rplT                                                   MARV�KKAM�NARKKHKKILKLAKGFRGSRSKLYRPA
gi|115250193       50S ribosomal protein L21 rplU†                                                  MYAIVKTGGKQYKVSEGDVLFVEKLEANAG
gi|115249088       50S ribosomal protein L24 rplX†                                                  MMRVKKGDTVVVIAGKDKGKKGSVLKVYPK
gi|115252547       50S ribosomal protein L31 rpmE†                                                  MQKEIQPKYNPVEVRCACGNTFVAGSTKDE
gi|115249104 (TB)  30S ribosomal protein S11 rpsK                                                   MAKPKKKVTRI|RR|RERKNIERGHA
gi|115249084 (TB)  50S ribosomal protein L16 rplP                                                   MLMPKRVK|RR|RVHRGSMAGQAHKGNKVTYG
gi|115249102       50S ribosomal protein L36 rpmJ                                                   MKVRPSVKPICEKCKVIKRKGKVMVICENP
gi|115250287       30S ribosomal protein S16 rpsP                                                   MLMPKRVKRRRVHRGSMAGQA
gi|115249062       50S ribosomal protein L33 rpmG                                                   MRVKVTLA
gi|115250209 (TB)  50S ribosomal protein L32 rpmF                                                   MAVPKRKTSKSNTKMRRA
gi|115249081 (T)   30S ribosomal protein S19 rpsS                                                   MSRST�KKGP�FVHARLLKKIEAMN

Protein synthesis initiation
CD0590             Hypothetical protein with N-terminal OB-fold RNA-binding domain†                 MANKLYSEIVNLLEEGRDELRKYDLKEKSI

*Putative signal peptides were predicted using the method of Tjalsma et al. (2000, 2004). The hydrophobic H domain is coloured grey and the predicted signal peptide cleavage sites are the last three amino acid residues (bold and underlined).

†Proteins predicted as ‘nonclassically secreted’ and lacking a signal peptide by SecretomeP (Bendtsen et al., 2005).

S, proteins likely to be secreted by the Sec pathway; L, lipoproteins (lipobox is italicized and bold); TB, twin arginine motifs (bold residues enclosed within | |); T, potential Tat pathway signal peptides (bold amino acids enclosed within ��).
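The twin-arginine (TB) annotation used in Table 2 can be illustrated with a short sketch. The Python below is a minimal example, not the prediction method of Tjalsma et al. (2000, 2004): it only reports where two consecutive arginine residues (RR) occur in an N-terminal sequence. The sequences are copied from Table 2 with the annotation delimiters removed; the names `TABLE2_N_TERMINI`, `find_twin_arginine` and `flag_twin_arginine` are illustrative assumptions. A full Tat signal prediction would also need to consider the residues flanking the RR pair (the consensus is commonly given as S/T-R-R-x-F-L-K) and the hydrophobic H domain, neither of which is checked here.

```python
import re

# N-terminal sequences copied from Table 2 with the annotation delimiters removed.
# The dictionary name and the gene-name keys are illustrative choices.
TABLE2_N_TERMINI = {
    "rpsK": "MAKPKKKVTRIRRRERKNIERGHA",        # marked TB in Table 2
    "rplP": "MLMPKRVKRRRVHRGSMAGQAHKGNKVTYG",  # marked TB in Table 2
    "rpmF": "MAVPKRKTSKSNTKMRRA",              # marked TB in Table 2
    "rpmJ": "MKVRPSVKPICEKCKVIKRKGKVMVICENP",  # no export class given
    "slpA": "MNKKNIAIAMSGLTVLASAAPVFA",        # marked T, S
}


def find_twin_arginine(seq: str) -> list[int]:
    """Return the 0-based start index of every 'RR' pair, overlaps included."""
    # A zero-width lookahead lets overlapping pairs (as in 'RRR') each be reported.
    return [m.start() for m in re.finditer(r"(?=RR)", seq)]


def flag_twin_arginine(sequences: dict[str, str]) -> dict[str, bool]:
    """Map each gene name to True if its sequence holds at least one 'RR' pair."""
    return {gene: bool(find_twin_arginine(seq)) for gene, seq in sequences.items()}


if __name__ == "__main__":
    for gene, flagged in flag_twin_arginine(TABLE2_N_TERMINI).items():
        print(gene, flagged)
```

In this sketch the three sequences marked TB in Table 2 are flagged and the other two are not. An RR pair alone is a weak signal, so a match should be read as a prompt for closer inspection rather than as evidence of Tat-dependent export.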

Concluding remarks

In this study, our GeLC-MS proteomics approach identified C. difficile 630 proteins expressed during mid-log phase growth in BHI broth. This extends the proteomics information for C. difficile, allowing the reconstruction of several central metabolic pathways, including the reductive branch of the leucine fermentation pathway. The clostridial research community is now in a position where the increasing availability of genomic, transcriptomic and proteomic information for C. difficile should enable the generation of datasets that are sufficiently robust for systems biologists to develop metabolic models for this clinically important microorganism. This should allow predictions to be made regarding the roles and expression of key virulence determinants and lead to the rapid identification of cellular targets for therapeutic purposes.

References

Aminov RI, Walker AW, Duncan SH, Harmsen HJM, Welling GW

& Flint HJ (2006) Molecular diversity, cultivation, and

improved FISH detection of a dominant group of human gut

bacteria related to Roseburia and Eubacterium rectale. Appl

Environ Microb 72: 6371–6376.

Anonymous (2006) Investigation into Outbreaks of Clostridium

difficile at Stoke Mandeville Hospital, Buckinghamshire

Hospitals NHS Trust. Commission for Healthcare Audit and

Inspection, London.

Bartlett JG (2006) Narrative review: the new epidemic of

Clostridium difficile-associated enteric disease. Ann Intern Med

145: 758–764.

Bartlett JG & Gerding DN (2008) Clinical recognition and

diagnosis of Clostridium difficile infection. Clin Infect Dis 46

(suppl 1): s12–s18.

Beck HC, Madsen SM, Glenting J, Petersen J, Israelsen H,

Nørrelykke MR, Antonsson M & Hansen AM (2009)

Proteomic analysis of cell surface-associated proteins from

probiotic Lactobacillus plantarum. FEMS Microbiol Lett 297:

61–66.

Fig. 3. Genomic context of genes in Clostridium difficile strain 630 encoding the protein machinery of the reductive branch of the leucine fermentation

pathway.


Bendtsen JD, Nielsen H, von Heijne G & Brunak S (2004)

Improved prediction of signal peptides: SignalP 3.0. J Mol Biol

340: 783–795.

Bendtsen JD, Kiemer L, Fausbøll A & Brunak S (2005) Non-

classical protein secretion in bacteria. BMC Microbiol 5: 58–70.

Bradford MM (1976) A rapid and sensitive detection method for

the quantitation of microgram quantities of protein utilizing

the principle of protein-dye binding. Anal Biochem 72:

248–254.

Callister SJ, McCue LA, Turse JE, Monroe ME, Auberry KJ,

Smith RD, Adkins JN & Lipton MS (2008) Comparative

bacterial proteomics: analysis of the core genome concept.

PLoS ONE 3: e1542.

Chaudhuri RR & Pallen MJ (2006) xBASE, a collection of online

databases for bacterial comparative genomics. Nucleic Acids

Res 34: D335–D337: database issue.

Cottingham K (2010) 1DE proves its worth . . . again. J Proteome

Res 9: 1636.

Desvaux M, Khan A, Scott-Tucker A, Chaudhuri RR, Pallen MJ &

Henderson IR (2005) Genomic analysis of the protein

secretion systems in Clostridium acetobutylicum ATCC 824.

Biochim Biophys Acta 1745: 223–253.

Desvaux M, Dumas E, Chafsey I & Hebraud M (2006) Protein cell

surface display in Gram-positive bacteria: from single protein

to macromolecular protein structure. FEMS Microbiol Lett

256: 1–15.

Djordjevic MA, Chen HC, Natera S, Van Noorden G, Menzel C,

Taylor S, Renard C, Geiger O & Weller GF (2003) A global

analysis of protein expression profiles in Sinorhizobium

meliloti: discovery of new genes for nodule occupancy and

stress adaptation. Mol Plant Microbe In 16: 508–524.

Dramsi S, Trieu-Cuot P & Bierne H (2005) Sorting sortases: a

nomenclature proposal for the various sortases of Gram-

positive bacteria. Res Microbiol 156: 289–297.

Durai R (2007) Epidemiology, pathogenesis, and management of

Clostridium difficile infection. Dig Dis Sci 52: 2958–2962.

Emanuelsson O, Brunak S, von Heijne G & Nielsen H (2007)

Locating proteins in the cell using TargetP, SignalP and related

tools. Nat Protoc 2: 953–971.

Fang Y, Robinson DP & Foster LJ (2010) Quantitative analysis of

proteome coverage and recovery rates for upstream

fractionation methods in proteomics. J Proteome Res 9:

1902–1912.

Gardy JL, Laird MR, Chen F, Rey S, Walsh CJ, Ester M &

Brinkman FSL (2005) PSORTb v.2.0: expanded prediction of

bacterial protein subcellular localization and insights gained

from comparative proteome analysis. Bioinformatics 21:

617–623.

Gottschalk G (1979) Bacterial Metabolism. Springer Verlag Inc.,

New York.

Graham RL, Pollock CE, Ternan NG & McMullan G (2006a) Top-

down proteomic analysis of the soluble sub-proteome of the

obligate thermophile, Geobacillus thermoleovorans T80:

insights into its cellular processes. J Proteome Res 5: 822–828.

Graham RL, Pollock CE, O’Loughlin SN, Ternan N, Weatherly

DB, Tarleton RL & McMullan G (2007) Multidimensional

analysis of the insoluble sub-proteome of Oceanobacillus

iheyensis HTE831, an alkaliphilic and halotolerant deep-sea

bacterium isolated from the Iheya ridge. Proteomics 7: 82–91.

Graham RLJ, O’Loughlin SN, Pollock CE, Ternan NG, Weatherly

DB, Jackson PJ, Tarleton RL & McMullan G (2006b) A

combined shotgun and multidimensional proteomic analysis

of the insoluble subproteome of the obligate thermophile,

Geobacillus thermoleovorans T80. J Proteome Res 5: 2465–2473.

Green S, Cortes N, Prime K & Hadfield S (2007) Molecular

epidemiology of Clostridium difficile in Southampton

2001–2005. J Infection 55: e93.

Hall IC & O’Toole E (1935) Intestinal flora in newborn infants

with a description of a new pathogenic anaerobe, Bacillus

difficilis. Am J Dis Child 49: 390–402.

He M, Sebaihia M, Lawley TD et al. (2010) Evolutionary

dynamics of Clostridium difficile over short and long time

scales. P Natl Acad Sci USA 107: 7527–7532.

Heap JT, Pennington OJ, Cartman ST & Minton NP (2009) A

modular system for Clostridium shuttle plasmids. J Microbiol

Meth 78: 79–85.

Henderson B, Allan E & Coates ARM (2006) Stress wars: the

direct role of host and bacterial molecular chaperones in

bacterial infection. Infect Immun 74: 3693–3706.

Jank T, Giesemann T & Aktories K (2007) Rho-glucosylating

Clostridium difficile toxins A and B: new insights into structure

and function. Glycobiology 17: 15R–22R.

Karp PD, Ouzounis CA, Moore-Kochlacs C, Goldovsky L, Kaipa

P, Ahren D, Tsoka S, Darzentas N, Kunin V & Lopez-Bigas N

(2005) Expansion of the BioCyc collection of pathway/genome

databases to 160 genomes. Nucleic Acids Res 33: 6083–6089.

Kim J, Darley D & Buckel W (2005) 2-Hydroxyisocaproyl-CoA

dehydratase and its activator from Clostridium difficile. FEBS J

272: 550–561.

Kim J, Darley D, Selmer T & Buckel W (2006) Characterization of

(R)-2-hydroxyisocaproate dehydrogenase and a family III

coenzyme A transferase involved in reduction of L-leucine to

isocaproate by Clostridium difficile. Appl Environ Microb 72:

6062–6069.

Kim J, Darley DJ, Buckel W & Pierik AJ (2008) An allylic ketyl

radical intermediate in clostridial amino acid fermentation.

Nature 452: 239–242.

Kirby JM, Ahern H, Roberts AK, Kumar V, Freeman Z, Acharya

KR & Shone CC (2009) Cwp84, a surface-associated cysteine

protease, plays a role in the maturation of the surface layer of

Clostridium difficile. J Biol Chem 284: 34666–34673.

Kuijper E, Coignard B, Brazier J et al. (2007) Update of

Clostridium difficile-associated disease due to PCR ribotype

027 in Europe. Euro Surveill 12: 163–166.

Lawley TD, Croucher NJ, Yu L, Clare S, Sebaihia M, Goulding D,

Pickard DJ, Parkhill J, Choudhary J & Dougan G (2009)

Proteomic and genomic characterization of highly infectious

Clostridium difficile 630 spores. J Bacteriol 191: 5377–5386.


Loo VG, Poirier L, Miller MA et al. (2005) A predominantly

clonal multi-institutional outbreak of Clostridium difficile-associated diarrhea with high morbidity and mortality. N Engl

J Med 353: 2442–2449.

Lyras D, O’Connor JR, Howarth PM et al. (2009) Toxin B is

essential for virulence of Clostridium difficile. Nature 458:

1176–1179.

Marcos LA & DuPont HL (2007) Advances in etiology and new

therapeutic approaches in acute diarrhea. J Infection 55:

385–393.

Mulkidjanian AY, Koonin EV, Makarova KS et al. (2006) The

cyanobacterial genome core and the origin of photosynthesis.

P Natl Acad Sci USA 103: 13126–13131.

O’Connor JR, Lyras D, Farrow KA, Adams V, Powell DR, Hinds J,

Cheung JK & Rood JI (2006) Construction and analysis of

chromosomal Clostridium difficile mutants. Mol Microbiol 61:

1335–1351.

O’Toole PW, Snelling WJ, Canchaya C et al. (2010) Comparative

genomics and proteomics of Helicobacter mustelae, an

ulcerogenic and carcinogenic gastric pathogen. BMC Genomics

11: 164.

Pepin J, Valiquette L & Cossette B (2005) Mortality attributable to

nosocomial Clostridium difficile-associated disease during an

epidemic caused by a hypervirulent strain in Quebec. Can Med

Assoc J 173: 1037–1042.

Potrykus J, White RL & Bearne SL (2008) Proteomic investigation

of amino acid catabolism in the indigenous gut anaerobe

Fusobacterium varium. Proteomics 8: 2691–2703.

Poutanen SM & Simor AE (2004) Clostridium difficile-associated

diarrhea in adults. Can Med Assoc J 171: 51–58.

Riley TV (1998) Clostridium difficile: a pathogen of the nineties.

Eur J Clin Microbiol Infect Dis 17: 137–141.

Romijn EP, Krijgsveld J & Heck AJ (2003) Recent liquid

chromatographic-(tandem) mass spectrometric applications

in proteomics. J Chromatogr A 1000: 589–608.

Sebaihia M, Wren BW, Mullany P et al. (2007) The multidrug-

resistant human pathogen Clostridium difficile has a highly

mobile, mosaic genome. Nat Genet 38: 779–786.

Spigaglia P & Mastrantonio P (2002) Molecular analysis of the

pathogenicity locus and polymorphism in the putative negative

regulator of toxin production (TcdC) among Clostridium

difficile clinical isolates. J Clin Microbiol 40: 3470–3475.

Stabler RA, Gerding DN, Songer JG, Drudy D, Brazier JS, Trinh HT,

Witney AA, Hinds J & Wren BW (2006) Comparative

phylogenomics of Clostridium difficile reveals clade specificity

and microevolution of hypervirulent strains. J Bacteriol 188:

7297–7305.

Stabler RA, He M, Dawson L, Martin M et al. (2009) Comparative

genome and phenotypic analysis of Clostridium difficile 027

strains provides insight into the evolution of a hypervirulent

bacterium. Genome Biol 10: R102.

Tjalsma H, Bolhuis A, Jongbloed JDH, Bron S & van Dijl JM

(2000) Signal peptide-dependent protein transport in Bacillus

subtilis: a genome-based survey of the SECRETOME. Microbiol

Mol Biol Rev 64: 515–547.

Tjalsma H, Antelmann H, Jongbloed JD et al. (2004) Proteomics

of protein secretion by Bacillus subtilis: separating the ‘‘secrets’’

of the SECRETOME. Microbiol Mol Biol Rev 68: 207–233.

Voth DE & Ballard JD (2005) Clostridium difficile toxins:

mechanism of action and role in disease. Clin Microbiol Rev 18:

247–263.

Warny M, Pepin J, Fang A, Killgore G, Thompson A, Brazier J,

Frost E & McDonald LC (2005) Toxin production by an

emerging strain of Clostridium difficile associated with

outbreaks of severe disease in North America and Europe.

Lancet 366: 1079–1084.

Weatherly DB, Atwood JA III, Minning TA, Cavola C, Tarleton RL

& Orlando R (2005) A heuristic method for assigning a false-discovery rate for protein identifications from Mascot database

search results. Mol Cell Proteomics 4: 762–772.

Wright A, Wait R, Begum S, Crossett B, Nagy J, Brown K &

Fairweather NF (2005) Proteomic analysis of cell surface

proteins from Clostridium difficile. Proteomics 5: 2443–2452.

Supporting information

Additional Supporting Information may be found in the online version of this article:

Appendix S1. Overview of, and commentary on, metabolic pathways active in Clostridium difficile strain 630.

Fig. S1. Number of unique Clostridium difficile strain 630 proteins identified in a mixed protein sample with repeated injection to LC-MS.

Fig. S2. Glycolysis and pentose phosphate pathway: showing proteins (boxed) identified in this investigation.

Fig. S3. Mixed acid fermentation: showing proteins (boxed) identified in this investigation.

Fig. S4. GABA metabolism: showing proteins (boxed) identified in this investigation.

Table S1. Excel spreadsheet with details of all proteins identified in this investigation, including molecular mass, pI, mowse score, signal peptide analysis, etc.

Table S2. PROVALT html output file with details of all peptides identified for each protein in this investigation, including number of spectra, sequences, mowse scores, % coverage, etc.

Please note: Wiley-Blackwell is not responsible for the

content or functionality of any supporting materials supplied

by the authors. Any queries (other than missing material)

should be directed to the corresponding author for the article.
