Supplementary Materials for · hydrogen citrate. Data acquisition was performed in positive ion and...

www.sciencemag.org/cgi/content/full/science.1229663/DC1

Supplementary Materials for

Natively Inhibited Trypanosoma brucei Cathepsin B Structure Determined Using an X-ray Laser

Lars Redecke, Karol Nass, Daniel P. DePonte, Thomas A. White, Dirk Rehders, Anton Barty, Francesco Stellato, Mengning Liang, Thomas R.M. Barends, Sébastien Boutet, Garth J. Williams, Marc Messerschmidt, M. Marvin Seibert, Andrew Aquila, David Arnlund, Sasa Bajt, Torsten Barth, Michael J. Bogan, Carl Caleman, Tzu-Chiao Chao, R. Bruce Doak, Holger Fleckenstein, Matthias Frank, Raimund Fromme, Lorenzo Galli, Ingo Grotjohann, Mark S. Hunter, Linda C. Johansson, Stephan Kassemeyer, Gergely Katona, Richard A. Kirian, Rudolf Koopmann, Chris Kupitz, Lukas Lomb, Andrew V. Martin, Stefan Mogk, Richard Neutze, Robert L. Shoeman, Jan Steinbrener, Nicusor Timneanu, Dingjie Wang, Uwe Weierstall, Nadia A. Zatsepin, John C.H. Spence, Petra Fromme, Ilme Schlichting, Michael Duszenko, Christian Betzel*, Henry N. Chapman*

*To whom correspondence should be addressed. E-mail: [email protected] (H.N.C.) or [email protected] (C.B.)

Published 29 November 2012 on Science Express

DOI: 10.1126/science.1229663

This PDF file includes

Materials and Methods
Supplementary Text
Figs. S1 to S10
Tables S1 to S5
Full References



Materials and Methods

TbCatB expression and isolation of in vivo crystals

The pre-pro-form of TbCatB was expressed in SF9 insect cells infected with recombinant baculovirus as previously described (11). In vivo grown TbCatB crystals were isolated after 8 days, yielding approximately 10⁸ crystals per ml, which were stored in 1x phosphate buffered saline at room temperature.

Electron microscopy

Transmission electron microscopy (TEM) was performed as previously described (11). The size distribution of the in vivo grown TbCatB crystals was analyzed by scanning electron microscopy (SEM). A FEI Helios Nanolab 600 microscope was used to image the purified microcrystals. A 1 µl droplet of crystal suspension was deposited on a polished silicon substrate and dried under clean nitrogen. The instrument was operated with an electron accelerating voltage of 5 kV and a current of 0.89 pA. The pressure in the chamber was 10⁻⁶ mbar. Needle-shaped crystals with an average width of 0.9 µm and an average length of 11 µm were observed.

Mass spectrometry

Matrix-assisted laser desorption/ionization (MALDI) time-of-flight (TOF) spectrometry was performed on an ultrafleXtreme instrument (Bruker Daltonics, Germany). In vivo grown crystals were dissolved in acidic buffer as previously described (11). 2 µl of this sample solution were mixed with 2 µl of 2% TFA and 2 µl of a matrix solution containing 10 mM 2,6-dihydroxyacetophenone (DHAP) and 20 mM diammonium hydrogen citrate. Data acquisition was performed in positive ion and linear mode (ion source 1: 25 kV, ion source 2: 23.7 kV, lens: 7 kV, laser power: 47%; laser shots: 2000). MS spectra were processed using flexAnalysis (v3.3, Bruker Daltonics).

Serial femtosecond crystallography

SFX experiments were carried out at the Coherent X-ray Imaging (CXI) instrument (19) at the Linac Coherent Light Source (LCLS) in the SLAC National Accelerator Laboratory (Menlo Park, California, USA) (30).
X-ray pulses of 40-fs duration containing 4 × 10¹¹ photons/pulse at a wavelength of 1.32 Å (9.4 keV) were focused to ~4 µm diameter at the interaction point using beryllium compound refractive lenses, corresponding to a peak power density in excess of 10¹⁷ W cm⁻² and a sample dose of up to 31 MGy. The electron and photon beam parameters are summarized in table S1. Purified in vivo grown TbCatB crystals were crushed prior to measurements as previously described (11). The crystal number density was adjusted to approximately 2 × 10⁹ crystals per ml to optimize the hit rate. The crystals were injected into the FEL beam in phosphate buffered saline solution using a liquid microjet (21) focused to a diameter of about 4 µm at a flow rate of 10 µl min⁻¹. A rotating syringe device (31) was used for sample storage and delivery to avoid settling of the crystals during the measurement time. The position of the x-ray beam was adjusted to intersect the continuous liquid column, before the Rayleigh break-up of the jet into drops. Single shot diffraction patterns of randomly oriented crystals were recorded at 120 Hz repetition rate (7,200 patterns per minute) by a Cornell-SLAC Pixel Array Detector (CSPAD) consisting of 64 independently controlled panels of 192 × 185 pixels each, forming a 1516 × 1516 pixel array (arranged into 1702 × 1702 pixels with small gaps between the panels) with a hole in the centre to let the intense x-ray pulses pass through (19, 20). The pixel size was 110 × 110 µm². The distance of the sample to the detector was set to 77 mm, corresponding to a maximum resolution of 1.3 Å at the detector edge. 3,953,201 detector frames were recorded, 293,195 of which contained single-crystal diffraction patterns.

Peak detection and local background correction were performed using the Cheetah software package (http://www.desy.de/~barty/cheetah/). The first step of pre-processing is to subtract a dark frame from each measurement frame. Coherent scattering from the liquid jet gives rise to strong diffraction at low angles mainly in the perpendicular direction to the jet. These regions are identified and masked. The scattering from the room-temperature water jet also gives rise to a broad isotropic ring at a scattering angle corresponding to about 3.5 Å resolution. The strength of this ring varies significantly from shot to shot due to fluctuations of the jet position and x-ray pulse intensity. This ring is removed by a median filter that discriminates Bragg peaks from this slowly varying background. The measurement frames are then discriminated for crystal “hits” by locating peaks that lie above a threshold of 50 detector counts per pixel. Frames that contain more than 10 detected peaks are deemed a hit. The background from each hit is then further refined by computing, at each pixel, the median pixel value of a square 2x2 region around that pixel, then subtracting this from the measurement. These background-corrected patterns of ten or more Bragg peaks are then passed to the CrystFEL software package (23) for indexing and averaging, applying the unit cell parameters of in vivo grown TbCatB crystals previously determined (11).
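The hit-finding steps described above can be sketched in a few lines. This is a simplified illustration only, not the Cheetah implementation: the function names, the use of nested lists for frames, the strict local-maximum peak criterion and the `radius` parameter of the median window are all assumptions made for the example. Only the threshold of 50 counts and the "more than 10 peaks" rule come from the text.

```python
from statistics import median

PEAK_THRESHOLD = 50  # detector counts per pixel (from the text)
MIN_PEAKS = 10       # a frame needs more than this many peaks to count as a hit


def subtract_dark(frame, dark):
    """Subtract a dark frame from a measurement frame, pixel by pixel."""
    return [[p - d for p, d in zip(row, dark_row)]
            for row, dark_row in zip(frame, dark)]


def local_median_background(frame, radius=1):
    """Median of the square window around each pixel (clipped at the edges).

    The window size is an assumption; it is exposed as a parameter.
    """
    ny, nx = len(frame), len(frame[0])
    background = []
    for y in range(ny):
        row = []
        for x in range(nx):
            window = [frame[j][i]
                      for j in range(max(0, y - radius), min(ny, y + radius + 1))
                      for i in range(max(0, x - radius), min(nx, x + radius + 1))]
            row.append(median(window))
        background.append(row)
    return background


def refine_background(frame, radius=1):
    """Subtract the local median background from a frame."""
    background = local_median_background(frame, radius)
    return subtract_dark(frame, background)


def count_peaks(frame, threshold=PEAK_THRESHOLD):
    """Count pixels above threshold that exceed their four direct neighbours."""
    ny, nx = len(frame), len(frame[0])
    peaks = 0
    for y in range(ny):
        for x in range(nx):
            value = frame[y][x]
            if value <= threshold:
                continue
            neighbours = [frame[j][i]
                          for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                          if 0 <= j < ny and 0 <= i < nx]
            if all(value > n for n in neighbours):
                peaks += 1
    return peaks


def is_hit(frame, dark):
    """Apply dark subtraction, then the 'more than 10 peaks' criterion."""
    return count_peaks(subtract_dark(frame, dark)) > MIN_PEAKS
```

Masking of the jet streak and removal of the water ring are left out of this sketch; in a real pipeline they would precede peak counting.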
A total of 178,875 diffraction patterns yielded a complete set of structure factors from 20 to 2.1 Å resolution merged from 25,969 reflections. As indicated by an I/σ(I) value of 2.37 in the highest resolution shell, the dataset is of sufficient quality to perform structural refinement within this resolution range. Quality measures are summarized in table S2, table S3 and fig. S4. It is interesting to note that the standard deviations of the unit cell dimensions are approximately 0.5 %, which is assumed to be mainly a result of non-isomorphism of the in vivo grown crystals, but also, to a lesser extent, of the limited pixel count in the detector.

Structure determination

The SFX data were phased by molecular replacement using MOLREP (32) and the coordinates of the mature active form of TbCatB as a search model (Protein Data Bank ID, 3MOR), which exhibits an amino acid sequence identity of 86 % to the T. brucei procathepsin B determined in this study due to the missing propeptide sequence. During different stages of model building and refinement using COOT (33) and REFMAC5 (34), respectively, 62 propeptide residues, 5 carbohydrate residues, and 98 water molecules were placed in a difference electron density map. The structure of the TbCatB-propeptide complex was refined at a resolution of 2.1 Å to an R-factor of 18.1 % (Rfree = 21.4 %). Refinement statistics are summarized in table S2.
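As a quick consistency check on the frame counts reported for the SFX data collection, the hit rate, the fraction of hits contributing to the merged dataset and a lower bound on the recording time can be derived directly. The sketch below assumes that the 178,875 patterns are the indexed subset of the 293,195 hits and that frames were recorded without interruption at 120 Hz; neither assumption is stated explicitly in the text.

```python
FRAMES_RECORDED = 3_953_201   # detector frames (from the text)
HITS = 293_195                # frames with single-crystal diffraction
MERGED_PATTERNS = 178_875     # patterns contributing to the structure factors
REPETITION_RATE_HZ = 120

hit_rate = HITS / FRAMES_RECORDED
# Assumption: the merged patterns are the successfully indexed hits.
indexing_rate = MERGED_PATTERNS / HITS
# Assumption: continuous recording, so this is a lower bound on beam time.
recording_hours = FRAMES_RECORDED / REPETITION_RATE_HZ / 3600

print(f"hit rate:        {hit_rate:.2%}")
print(f"indexing rate:   {indexing_rate:.2%}")
print(f"recording time:  {recording_hours:.2f} h (minimum)")
```

Under these assumptions the hit rate is a little over 7 %, roughly 61 % of the hits were merged, and the frames correspond to slightly more than nine hours of uninterrupted recording.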


Structure analysis and illustrations

The propeptide-enzyme interface was characterized with the PISA server (35) at the European Bioinformatics Institute. For direct comparison, residues equivalent to Lys73 to Ser78 of the TbCatB propeptide that are not defined by electron density in the SFX structure were also deleted in the human procathepsin B structure (Protein Data Bank ID, 3PBH). All illustrations were prepared using PyMOL v1.3 (DeLano Scientific; http://www.pymol.org).

Supplementary Text

S1. The TbCatB structure

The TbCatB structure determined by SFX exhibits the typical papain-fold that is highly conserved for cathepsin B-like proteases (25), which consists of two spherical left (L-) and right (R-) domains of similar size (Fig. 1C). The substrate-binding pocket is located in an extended and narrow cleft between the two domains, containing the active site Cys (L-domain) and His (R-domain) residues within the so-called S1 pocket, together with a series of N- (non-prime site: S2 and S3) and C-terminal (prime site: S1′ and S2′) subsites (28). In mature CatB, the highly flexible occluding loop, which is a unique insertion within the L-domain of cathepsin B-like proteases, blocks the S2′ and in part the S1′ subsite, conferring additional exo-peptidase activity.

S2. Proteolytic maturation of T. brucei procathepsin B

Cysteine proteases are expressed as inactive zymogens with a covalently bound propeptide that suppresses the activity of the catalytic mature domain. Conversion into the active (mature) form occurs by proteolytic cleavage of the propeptide. For cathepsin B, a multistep auto-activation process forming different intermediate cleavage states on the pathway was suggested upon exposure to acidic pH environments in the lysosome, which enhances the conformational flexibility of the propeptide due to a pH-dependent destabilization of its interaction with the enzyme domain (36).

Mass spectrometric (MS) analysis of the dissolved in vivo grown TbCatB crystals showed a dominating mass peak of 7947.01 g/mol. This mass corresponds well to the theoretical mass (7949.71 g/mol) of the propeptide residues Val25 to Ser78 with a carbohydrate structure of nine mannose and two N-acetylglucosamine (NAG) residues that is characteristic for high-mannose glycosylation of insect cells N-linked to the consensus motif Asn58-Ile59-Thr60. Consequently, proteolytic maturation of T. brucei procathepsin B appears to proceed C-terminal to Ser78 after recombinant protein expression within the insect cells, although the cleaved propeptide sequence remained bound to the substrate binding site. Residues Lys73 to Ser78 (C-terminal) are suggested not to be defined by electron density in the crystal structure due to an increased conformational flexibility. This also applies to Val25, which is suggested to be the N-terminus of the bound propeptide, based on the MS data. The N-terminus of mature TbCatB is therefore extended by 15 propeptide residues compared to mammalian CatB, which usually starts at residue Leu80 (human CatB numbering, fig. S9) (37), consistent with crystal structures of mature TbCatB (10, 11). Further experiments are required to investigate the unusual processing of T. brucei procathepsin B and the potential impact of exogenous proteases within the activation process.
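To put the agreement between the observed and theoretical masses into numbers, the deviation can be expressed in absolute and relative terms. The sketch below uses only the two masses and the residue range quoted above; the function and variable names are illustrative, and no attempt is made to recompute the theoretical mass itself because the propeptide sequence is not given here.

```python
OBSERVED_MASS = 7947.01      # g/mol, dominating MS peak (from the text)
THEORETICAL_MASS = 7949.71   # g/mol, Val25-Ser78 plus Man9(NAG)2 (from the text)


def mass_deviation(observed, theoretical):
    """Return the absolute deviation (g/mol) and the relative deviation (ppm)."""
    difference = observed - theoretical
    ppm = difference / theoretical * 1e6
    return difference, ppm


difference, ppm = mass_deviation(OBSERVED_MASS, THEORETICAL_MASS)

# Number of propeptide residues in the assigned fragment Val25 to Ser78.
n_residues = 78 - 25 + 1

print(f"deviation: {difference:.2f} g/mol ({ppm:.0f} ppm) over {n_residues} residues")
```

The observed peak lies about 2.7 g/mol below the theoretical value, a relative deviation of roughly 340 ppm.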


S3. The propeptide structure

The coordinated propeptide chain wraps around the enzymatic domain of TbCatB in a C-shaped fold as observed for mammalian procathepsin B structures (13, 38), preventing access to the active site (Fig. 2A). Although the propeptides of T. brucei (71 aa) and human (62 aa) CatB share a relatively low overall sequence similarity of 47 % (fig. S9), two main secondary structure elements, a three-turn α-helix (1) / antiparallel β-strand motif and a two-turn α-helix (2), are highly conserved along with the stabilizing internal hydrogen bond network (fig. S5). The other parts of the T. brucei propeptide exhibit conformational deviations up to 6 Å, associated with an enhanced internal stabilization compared to the human propeptide structure. This is largely due to (i) a tight intramolecular anchoring of the N-terminal residues at helix 2 by four non-conserved hydrogen bonds that strengthen the compact turn formation within the kinked region, (ii) an optimized stabilization of the helix 1 C-terminus by two H-bonds that is additionally enlarged by one residue (Leu43), and (iii) a three-residue insertion at the C-terminal extended linker region that supports a unique α-helix (3) fold (residues 87 to 91) by establishing seven non-conserved H-bonds (fig. S5B and table S4). A three-residue insertion was predicted by sequence comparison between Asn75 and Ile79 of the Tb propeptide (fig. S5A). However, this sequence comprises the major part of the six residues absent in our SFX model.

S4. Propeptide interaction with the mature enzyme domain

The tight interaction between TbCatB and its propeptide is based on a multi-epitope complex interface made of hydrophobic and polar contacts that covers a total interface area of 2,150 Å², representing 36 % and 17 % of the accessible surface of the propeptide and the mature enzyme, respectively. The major epitopes within the enzyme structure, comprising the exosite of the R domain (Glu262 to Gly276) as well as the active site cleft jointly assembled by residues of the L (Gln116 to Trp123 and Cys162 to Arg169) and the R domain (Leu279 to Ala283), remain largely conserved compared to mammalian procathepsin B (13, 38) (fig. S9). Since main-chain atoms of the propeptide residues are preferentially involved in the polar interactions, only a few differences in the otherwise homologous hydrogen bond pattern result from individual amino acid substitutions between both species (fig. S7A and table S5).

The N-terminal part of the T. brucei propeptide (residues 26 to 31) does not contribute to the complex interaction. At the subsequent exosite (fig. S7B,C), the termini of the conserved antiparallel β-strand interface along Ile47 to Lys51 are less stabilized in the TbCatB complex compared to human procathepsin B. The hydrophobic side chains of Ile47 and Val55 in the trypanosomal propeptide prevent the formation of five H-bonds that are established by polar residues at the equivalent positions in the human structure (13), without affecting the comparable hydrophobic contacts. Following a wide kink, the peptide chain enters the prime subsite of the active site cleft that is significantly narrowed in TbCatB due to an altered “open” conformation of the occluding loop (see below). This displaces the T. brucei propeptide by approximately 4 Å into a raised position that is closer to the surface of the enzyme’s R domain compared to the human propeptide (fig. S6), well anchored in the hydrophobic environment by strong hydrophobic contacts of Met56, Ile59, and Leu61 (fig. S7A). However, the shift also weakens the T. brucei propeptide interaction with the occluding loop of the enzyme’s L domain that forms the opposite rim of the prime subsite crevice, including the loss of two H-bonds (Asp60-Cys205 and Tyr63-Cys192) that are observed in human procathepsin B (13).

Within the α-helix 2 motif that shields the S2′ and S1′ binding sites, the trypanosomal propeptide converges again with the human chain, running in a superimposable conformation across the active site (S1) of the enzyme (fig. S6). Thus, the inhibitory interactions of the T. brucei propeptide closely correspond to those described for human procathepsin B (13), depending on coordination of the peptide in the reverse direction compared to the substrate, which prevents a carbonyl group from entering the catalytic oxyanion hole. The active site interaction network is destabilized by two polar contacts in the TbCatB complex, since Thr70, which was considered to significantly contribute to the tightness of the human propeptide interaction by establishing two H-bonds via its side chain hydroxyl group with Cys122 and Ala283 of the enzyme (13), is substituted by a valine residue with a hydrophobic side chain. However, Asn68 of the T. brucei propeptide forms one additional H-bond to Leu279 along with improved hydrophobic contacts, which restores at least the strength of the interaction with the enzyme’s R domain.

Before entering the S2 subsite, the trypanosomal propeptide leaves the active site cleft and is diverted to the enzyme surface where the proteolytic maturation of the peptide occurred. At Val70, the main chain progression already starts to deviate from that of the corresponding human propeptide, exhibiting a shift of 4.5 Å for the Cα atom of Lys72, the last residue that is defined by electron density at the C-terminus of the propeptide. However, these differences are more likely attributable to the increased flexibility of the C-terminal propeptide residues after proteolytic maturation than to an altered binding mode of the T. brucei propeptide. The S2 subsite of TbCatB is known to be enlarged due to the presence of aspartate and glycine at positions 166 and 328, replacing tyrosine and glutamate residues that define the steric and electrostatic constraints for P2 substituents in human CatB (10, 11). Moreover, an H-bond established by the side chain of Glu324 that anchors the human propeptide deeper in the S2 subsite is not conserved in TbCatB.

Starting at Ile79, the part of the T. brucei propeptide that remained covalently bound to the TbCatB enzyme domain runs in an extended conformation along the surface of the enzyme’s R domain, as observed for the human propeptide (13) (fig. S7D,E). Only two H-bonds are conserved between both species, but the overall number of contacts established by the individual side chains remained similar (table S5). A strong polar interaction between Arg83 and Glu89 anchored at a small surface cavity by a hydrophobic contact of Phe85 raises the main chain atoms of the T. brucei propeptide away from the CatB surface to form a third short α-helix.

Taken together, the TbCatB-propeptide complex interface is enlarged by 310 Å² compared to that of human procathepsin B (13), although fewer enzyme residues are involved (72 in TbCatB, 79 in human CatB). This gain in hydrophobic contacts partly compensates for the loss of eight polar interactions for interface stabilization.
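The interface figures quoted in this section imply a value for the human complex that is not stated explicitly. As a small worked example, assuming the 310 Å² enlargement is measured against the same kind of total interface area as the 2,150 Å² reported for TbCatB, the human interface area and the relative gain follow by subtraction:

```python
TBCATB_INTERFACE_A2 = 2150   # total interface area of the TbCatB complex (from the text)
ENLARGEMENT_A2 = 310         # enlargement relative to human procathepsin B (from the text)

# Assumption: both numbers refer to the same PISA-style total interface area.
human_interface_a2 = TBCATB_INTERFACE_A2 - ENLARGEMENT_A2
relative_gain = ENLARGEMENT_A2 / human_interface_a2

print(f"implied human interface: {human_interface_a2} Å²")
print(f"relative enlargement:    {relative_gain:.1%}")
```

Under that assumption the human interface would cover about 1,840 Å², so the trypanosomal interface is roughly one sixth larger.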

S5. The occluding loop

The occluding loop comprises residues Phe189 to Pro213 of the L domain in TbCatB. As observed for mammalian CatB (39), the highly conserved residues His194 and His195 span parts of the prime subsite of the substrate binding cleft in mature TbCatB (Fig. 3A), representing the “closed” conformation that is stabilized by five polar interdomain contacts including the critical His194 – Asp115 ion pair that dictates the pH dependent loop flexibility (28).

Competing for the binding site, the TbCatB propeptide raises the highly flexible loop out of the prime site into an “open” conformation (Fig. 3B). Displacements of more than 2 Å (Cα atoms), which are considered to be significant based on the estimated position displacement errors for individual residues that have been calculated from the diffraction precision index (fig. S10), are restricted to the four residues His194, His195, Ser196 and Lys197, while the remaining part of the loop is only slightly affected, including the C-terminal “FNFD” motif that was suggested to be stabilized by stacking interactions of the Phe residues (10). The largest movement of approximately 14 Å was detected for the side chain of His195, similar to the transition of the human loop. However, the highly mobile segment of the TbCatB occluding loop is much smaller than that of human CatB, which exhibits large structural differences for the entire 11 residues between the conserved disulfide bridge that links Cys187 and Cys198 (Fig. 3B) (13). As a prerequisite for the displacement, the salt bridge His194 – Asp115 that is suggested to be crucial for the stabilization of the “closed” loop conformation is consistently disrupted in T. brucei and human CatB, but three maintained H-bonds keep the residues directly flanking the mobile segment of the TbCatB loop, Cys192 and Ser198, at an almost fixed position during transition, acting as a hinge. A small shift of the Glu307 side chain further strengthens the interaction of the “open” loop with the R domain by formation of an additional H-bond to Ser193, while the polar contact to Asn200 is weakened, but not completely lost. The latter interaction is suggested to compensate for the Arg195 - Asp303 salt bridge (human CatB numbering) that is highly conserved for the “closed” conformation of mammalian CatB (8, 39, 40), but is absent in the trypanosomal enzyme. 
In contrast, only one of five polar contacts is maintained at the N-terminus of the flexible loop region (Cys187) in human CatB, significantly enhancing the degree of freedom during transition for the entire segment.

As a consequence of the enhanced rigidity of the TbCatB occluding loop, the cleft that becomes exposed at the prime subsite upon movement of the small mobile segment is more narrow than that of human CatB in the “open” conformation. This is mainly due to the small displacement of His194 (4.1 Å for Cα), which remained located within the substrate cleft (Fig. 3B). At the closest point, the side chains of Ile264 (R domain, Cγ2 atom) and His194 (loop, Nε atom) are separated by only 8.5 Å in TbCatB. In human CatB, both critical His residues are moved out of the cleft, exposing the entire crevice restricted by Leu261 (R domain, Cβ atom) and Lys198 (loop, Cβ atom). Another consequence of the narrow prime subsite cleft is the raised position of the bound propeptide described above.

S6. Impact of TbCatB glycosylation

Glycosylation has been identified as a crucial posttranslational modification of mammalian CatB that triggers zymogen targeting and processing in vivo (41, 42), while the impact on the enzyme activity has been suggested to be insignificant (29). We identified two N-linked oligosaccharide substitutions in the electron density of the TbCatB-propeptide complex that were not present in the search model (Fig. 2C,D). The beginning of an N-glycan core structure comprising two N-acetylglucosamine (NAG) and one mannose (MAN) residue is linked to Asn216 of the enzyme (Fig. 4A). This residue downstream of the C-terminus of the occluding loop represents the known glycosylation site of mature TbCatB (11). The low flexibility of these carbohydrate molecules, which are well-defined by the electron density map, results from interactions with the enzyme surface that align the carbohydrate parallel to the occluding loop (residues Pro188 to Pro190). The interface area covering 263 Å² is stabilized by two direct and one water-bridged polar contact involving the main-chain atoms of Pro188 and Lys214. Thus, the carbohydrate links both strands of the occluding loop, supporting the observed rigidity of the N- and C-terminal loop residues during conformational transition. Further carbohydrate residues were not defined by the electron density map.

Interestingly, the TbCatB enzyme glycosylation site is not conserved in human CatB (fig. S9). N-linked oligosaccharide substitution was predicted at Asn192, representing the highly flexible tip of the human occluding loop that points out straight into the solvent. This exterior localization indicates a minor influence of the human carbohydrate moiety on the conformational loop stability, already supported by activity measurements (29). However, the absence of high-resolution data characterizing the human N-glycan structure prevents a confirmation by structural analysis so far.

This lack of data also applies to a glycosylation consensus sequence identified at Asn44 of the human propeptide. Due to the remote localization at the C-terminus of the helix 1 motif, direct contacts to the mature enzyme surface appeared to be unfavorable, excluding an impact of the carbohydrate moiety on the propeptide affinity. In contrast, the carbohydrate residues of the TbCatB propeptide anchored at residue Asn58 at the kink of the peptide segment directly interact with the tip of the open occluding loop (Fig. 4B). Three hydrogen bonds connect the primary NAG unit with Gln57 of the propeptide chain and Ser196 of the occluding loop, forming a contact interface of 114 Å². Considering that the propeptide establishes no further direct polar contact with the occluding loop as discussed above, the carbohydrate moiety significantly contributes to the stabilization of the open loop conformation in TbCatB, additionally supporting the propeptide binding. However, the electron density is impaired for the second NAG unit that exhibits no interaction with the mature enzyme, preventing further analysis of this oligosaccharide chain.

The TbCatB propeptide exhibits an additional, non-conserved consensus motif for N-linked oligosaccharide substitution at Asn76. This residue, located at the propeptide insertion of TbCatB that spans the S2 subsite, is not defined in our SFX structure due to a high conformational flexibility. Thus, the relevance of this additional carbohydrate chain for the propeptide stabilization remains to be elucidated.


Fig. S1. Size distribution of in vivo grown TbCatB crystals injected into the FEL beam. Size distributions were determined from scanning electron micrographs of extracted crystals that have been crushed prior to measurements (11). The width (A) and length (B) of 300 crystals were measured, resulting in an average width of 0.9 µm and an average length of 11 µm. 55% of the crystals are shorter than 10 µm. (C) The average crystal volume is calculated to be 10.9 µm³. 38% of the crystals have a volume smaller than 5 µm³ and 65% are smaller than 10 µm³. Error bars represent standard deviations.


Fig. S2 Experimental setup for SFX at the CXI beamline of LCLS. At room temperature, a flowing suspension of in vivo grown TbCatB micro-crystals was produced in vacuum as a liquid jet, formed by gas focusing. The liquid column was 4 µm in diameter and flowed at 10 µl min⁻¹. Single-crystal diffraction data were recorded on a CSPAD detector operating at the 120 Hz repetition rate of the X-ray pulses. The crystals remained fully hydrated in the interaction region.
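The setup parameters given in this caption and in Materials and Methods can be cross-checked with a short calculation. The sketch below estimates the photon energy from the wavelength, the peak power density from the pulse parameters, and the jet speed from the flow rate. It assumes a uniform (top-hat) intensity over the 4 µm focal diameter, no beamline transmission losses, and a cylindrical jet of constant diameter; these simplifications and all function names are assumptions of the example.

```python
import math

EV_TO_JOULE = 1.602176634e-19
HC_KEV_ANGSTROM = 12.398419843   # h*c expressed in keV·Å


def photon_energy_kev(wavelength_angstrom):
    """Photon energy from wavelength via E = h*c / lambda."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def peak_power_density_w_cm2(photons, energy_kev, duration_s, focus_diameter_um):
    """Pulse energy divided by duration and focal area (top-hat assumption)."""
    pulse_energy_j = photons * energy_kev * 1e3 * EV_TO_JOULE
    radius_cm = focus_diameter_um * 1e-4 / 2
    return pulse_energy_j / duration_s / (math.pi * radius_cm ** 2)


def jet_speed_m_s(flow_ul_per_min, jet_diameter_um):
    """Mean speed of a cylindrical liquid column at a given volumetric flow."""
    flow_m3_per_s = flow_ul_per_min * 1e-9 / 60
    area_m2 = math.pi * (jet_diameter_um * 1e-6 / 2) ** 2
    return flow_m3_per_s / area_m2


energy = photon_energy_kev(1.32)
power_density = peak_power_density_w_cm2(4e11, energy, 40e-15, 4.0)
speed = jet_speed_m_s(10.0, 4.0)
travel_between_pulses_mm = speed / 120 * 1e3

print(f"photon energy: {energy:.2f} keV")
print(f"peak power density: {power_density:.2e} W/cm^2")
print(f"jet speed: {speed:.1f} m/s, {travel_between_pulses_mm:.0f} mm between pulses")
```

With these inputs the estimate reproduces the quoted 9.4 keV and a power density slightly above 10¹⁷ W cm⁻². The jet moves on the order of 0.1 m between consecutive pulses, so every pulse meets fresh sample.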


Fig. S3 Typical single-pulse X-ray diffraction pattern from an individual TbCatB in vivo crystal after background subtraction. Using a single X-ray pulse of approximately 40 fs duration, Bragg peaks were recorded on a CSPAD to better than 2 Å resolution. The inset shows a magnification of an individual high-resolution Bragg reflection. Background correction was performed as described in Materials and Methods.
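The resolution attainable in such a pattern is bounded by the detector geometry given in Materials and Methods. A minimal sketch of that relation follows, using Bragg's law with the scattering angle obtained from the sample-to-detector distance. It assumes a flat detector centred on the beam and uses the 1702-pixel arrangement width; the function name and these geometric simplifications are assumptions of the example.

```python
import math

WAVELENGTH_A = 1.32            # Å (from the text)
DETECTOR_DISTANCE_MM = 77.0    # mm (from the text)
PIXEL_SIZE_MM = 0.110          # 110 µm pixels (from the text)
ARRAY_WIDTH_PIXELS = 1702      # arrangement including panel gaps (from the text)


def resolution_at_radius(radius_mm,
                         distance_mm=DETECTOR_DISTANCE_MM,
                         wavelength_a=WAVELENGTH_A):
    """Resolution d = lambda / (2 sin(theta)), with tan(2 theta) = r / L."""
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength_a / (2.0 * math.sin(two_theta / 2.0))


half_width_mm = ARRAY_WIDTH_PIXELS * PIXEL_SIZE_MM / 2.0
d_side = resolution_at_radius(half_width_mm)                     # middle of a side
d_corner = resolution_at_radius(half_width_mm * math.sqrt(2.0))  # detector corner

print(f"side midpoint: {d_side:.2f} Å, corner: {d_corner:.2f} Å")
```

Under these assumptions the middle of a detector side corresponds to roughly 1.5 Å and the corners to roughly 1.3 Å, so the 1.3 Å limit quoted in Materials and Methods appears to match the detector corners.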


Fig. S4 Quality measures of the TbCatB SFX diffraction dataset. Plots of Rsplit (A) as a function of the number of indexed diffraction patterns and (B) as a function of the resolution. The graphs were calculated for a resolution range of 20 Å to 2.1 Å. The overall Rsplit value was calculated to be 10.2 %. (C) Wilson plot created with the program CrystFEL (23) for the resolution range of 20 Å to 2.0 Å.
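The caption does not define Rsplit. The sketch below uses the definition commonly used for SFX data merged with CrystFEL, in which the indexed patterns are divided into two halves that are merged separately and then compared reflection by reflection: Rsplit = (1/√2) · Σ|I_even − I_odd| / (½ Σ(I_even + I_odd)). Treating this as the formula behind Fig. S4 is an assumption, and the function name and list-based interface are illustrative.

```python
import math


def r_split(intensities_even, intensities_odd):
    """Rsplit for two half-dataset intensity lists in matching reflection order."""
    if len(intensities_even) != len(intensities_odd):
        raise ValueError("half datasets must contain the same reflections")
    numerator = sum(abs(a - b) for a, b in zip(intensities_even, intensities_odd))
    denominator = 0.5 * sum(a + b for a, b in zip(intensities_even, intensities_odd))
    return numerator / denominator / math.sqrt(2.0)


# Toy example with two reflections; the values are made up for illustration.
example = r_split([10.0, 20.0], [12.0, 18.0])
print(f"Rsplit = {example:.3f}")
```

The factor 1/√2 accounts for each half containing only half of the patterns, so the value estimates the error of the full merged dataset.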


Fig. S5 Internal stabilization of T. brucei and human CatB propeptides. (A) Scheme illustrating the sequence identity and the internal interaction network that stabilizes the human (upper part) and T. brucei (lower part) CatB propeptides. Identical residues are highlighted in red, homologous residues in yellow. Grey residues (73 to 78) are missing in the T. brucei propeptide structure due to proteolytic cleavage between residues 78 and 79 that gives rise to conformational flexibility of the preceding residues. Red and black clamps connecting the involved amino acids symbolize conserved and non-conserved interactions, respectively. Main chain donor/acceptor atoms are indicated by black clamp branches, while grey branches depict the involvement of side chain atoms. Displayed numbers represent the corresponding bond length (Å). (B) Superposition of T. brucei (green) and human (orange) propeptide structures. Regions folded into a well-defined secondary structure show a high degree of structural conservation. Residues spanning the active site of the enzyme are highlighted (red circle). Dashed lines indicate interactions that are not conserved between both propeptides, with the corresponding bond lengths (Å) displayed as numbers. A detailed comparison of the intramolecular interactions between the T. brucei and human CatB propeptides is presented in table S4.


Fig. S6 Active site inhibition of T. brucei and human CatB by their propeptides. Superposition of the T. brucei (orange) and the human (green) propeptides bound to the active site of CatB. Due to a slight displacement compared to the human propeptide, the T. brucei propeptide loses two polar contacts with the enzyme’s L domain at the α-helix 2 motif. However, the almost superimposable conformation of both propeptides at the active site of the respective enzyme demonstrates a high degree of conservation for the enzyme inhibition. All residues are labeled according to the TbCatB numbering, with the corresponding human CatB numbering in parentheses. Enzyme residues involved in direct interactions with the propeptides are displayed in light blue (TbCatB) and grey (human CatB) and labeled in italics; residues of the catalytic triad that assemble the active site of CatB are labeled in red. Hydrogen bonds are symbolized as dashed lines according to the color code of the propeptides, with the corresponding bond lengths (Å) displayed as numbers. The carbonyl group of Asn68 (Cys42) does not point into the oxyanion hole, as observed for bound substrates, and therefore prevents proteolytic cleavage. Differences in the interaction pattern between the T. brucei and the human propeptide at the active site are restricted to Val70 (Thr44), where the side chain of threonine in the human propeptide establishes two additional hydrogen bonds compared to the valine residue in the trypanosomal propeptide. In contrast, the asparagine side chain of the T. brucei propeptide at Asn68 (Cys42) forms a unique H-bond with Leu279 of the enzyme’s R domain.


Fig. S7 Intermolecular interactions of T. brucei and human propeptides with the mature CatB enzyme domain. (A) Scheme illustrating hydrophobic and polar interactions of the human (upper part) and T. brucei (lower part) CatB propeptides with the corresponding enzyme domain. Identical residues are highlighted in red, homologous residues in yellow; grey residues (73 to 78) are missing in the T. brucei propeptide structure due to an increased flexibility after proteolytic cleavage. Blue bars next to the sequences represent the buried surface area (% BSA) of each residue as a result of complex formation. The red and black dashed lines symbolize conserved and non-conserved hydrogen bonds, respectively, with the corresponding bond length (Å), connecting the displayed enzyme and propeptide residues. (B to E) Interaction of T. brucei (green) and human (orange) propeptides with the mature enzyme domains (grey) at the α-helix1/β-strand epitope (B, C) and the extended loop epitope (D, E). Hydrogen bonds are symbolized as dashed lines according to the color code of the propeptides; bond lengths are displayed as numbers (Å). Red circles highlight epitope regions that exhibit significant differences in the interaction pattern (B, C) or conserved interactions (D, E). A detailed comparison of the intermolecular interactions between the T. brucei and human CatB propeptides with the corresponding enzyme domains is presented in table S5.


Fig. S8 Structural homology of the TbCatB-propeptide complex and mature TbCatB. The Cα plot of the TbCatB-propeptide complex (grey) is superimposed on that of mature TbCatB (blue). The average RMS deviation between the enzyme structures of the TbCatB-propeptide complex determined by SFX at room temperature and refined to a resolution of 2.1 Å and of mature TbCatB (Protein Data Bank ID, 3MOR) (11) determined at 100 K by conventional X-ray crystallography and refined to 2.55 Å resolution is 0.42 Å for equivalent Cα atoms. The occluding loop region (residues 189 to 213) shows the major structural differences between the two structures; it is highlighted by the red box. The subsites (S2′ to S3) of the substrate binding cleft are labeled according to Schechter and Berger (1967) (24). The propeptide (residues 26 to 93) of the TbCatB-propeptide complex is displayed in green.
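The RMS deviation quoted in this caption is a root-mean-square distance over equivalent Cα atoms. A minimal sketch of that arithmetic follows, assuming the two structures have already been superimposed and that the coordinate lists are in matching residue order. The function name `ca_rmsd` and the example coordinates are hypothetical and are not taken from either structure.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMS deviation (Å) between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that position i in
    both lists refers to the same (equivalent) C-alpha atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates for illustration only.
example_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
example_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
example_rmsd = ca_rmsd(example_a, example_b)
```

The sketch deliberately omits the superposition step; in practice a dedicated structure-alignment tool performs the fit before the deviation is evaluated.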


Fig. S9 Sequence alignment of T. brucei and human pre-procathepsin B. Identical amino acids are highlighted in red, homologous residues in yellow. Residues involved in the assembly of the prime site of the substrate binding cleft are indicated by a green bar; confirmed (TbCatB) and predicted (human CatB) N-linked glycosylation sites are marked with blue asterisks. The signal peptide and the propeptide as well as the occluding loop were assigned based on the human pre-proCatB sequence (13).


Fig. S10 Estimated position displacement error for individual residues. Values in the plot have been calculated from the diffraction precision index, $\sigma_{r,\mathrm{DPI}}(B_{\mathrm{avg}})$, multiplied by the residue's average B-factor ($B_i$) and divided by the average B-factor for all residues ($B_{\mathrm{avg}}$).
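The scaling described in this caption can be sketched in a few lines. The DPI value is the one reported in table S2; the function name and the per-residue B-factors below are hypothetical placeholders used only to illustrate the arithmetic.

```python
def residue_position_error(dpi, b_residue, b_average):
    """Scale the overall DPI (Å) by a residue's B-factor relative to the mean."""
    if b_average <= 0:
        raise ValueError("b_average must be positive")
    return dpi * b_residue / b_average

DPI = 0.146  # Å, diffraction precision index from table S2

# Hypothetical per-residue average B-factors (Å^2), for illustration only.
example_b = {"res_A": 30.0, "res_B": 46.0, "res_C": 92.0}
b_avg = sum(example_b.values()) / len(example_b)

errors = {
    name: residue_position_error(DPI, b, b_avg)
    for name, b in example_b.items()
}
```

A residue whose B-factor equals the average is assigned exactly the overall DPI, and residues with higher B-factors get proportionally larger estimated errors.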


Table S1. Electron and photon beam parameters for the SFX experiment of TbCatB in vivo crystals at the CXI instrument (LCLS)

| Parameter | Mean | Standard deviation |
|---|---|---|
| Electron energy* [MeV] | 14541.9 | 11.8 |
| Number of electrons* | 9e+08 | 1.5e+07 |
| Charge* [nC] | 0.149 | 0.002 |
| Electron pulse duration* [fs] | 42 | 2.5 |
| Photon pulse energy at sample* [mJ] | 0.59 | 0.05 |
| Photon energy* [eV] | 9379 | 16 |
| Photon wavelength* [Å] | 1.32 | 0.002 |
| Peak X-ray power* [GW] | 54 | -- |
| Power density at the sample** [W cm⁻²] | 0.8e+17 | -- |

* Varies from shot to shot.

** Focus 10 µm², 15% beamline transmission included.
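Two of the entries in table S1 can be cross-checked from the others. The sketch below converts the photon energy to a wavelength and estimates the power density at the sample. It assumes, as our reading of the second footnote, that the power density is the peak power times the beamline transmission divided by the focus area; the function names are hypothetical.

```python
HC_EV_ANGSTROM = 12398.42  # Planck constant times speed of light, in eV·Å (rounded)

def wavelength_angstrom(photon_energy_ev):
    """Photon wavelength in Å for a photon energy given in eV."""
    return HC_EV_ANGSTROM / photon_energy_ev

def power_density_w_per_cm2(peak_power_gw, transmission, focus_area_um2):
    """Power density at the sample, assuming density = power * transmission / area."""
    power_w = peak_power_gw * 1e9 * transmission
    area_cm2 = focus_area_um2 * 1e-8  # 1 µm² = 1e-8 cm²
    return power_w / area_cm2

wavelength = wavelength_angstrom(9379)            # photon energy from table S1
density = power_density_w_per_cm2(54, 0.15, 10)   # peak power, transmission, focus
```

With the tabulated values this gives a wavelength of about 1.32 Å and a power density of about 0.8e+17 W cm⁻², consistent with the table.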


Table S2. SFX data collection and refinement statistics

| Parameter | Value |
|---|---|
| Data collection | |
| Wavelength (Å) | 1.32 |
| Maximum dose per crystal (MGy) | 31 |
| Space group | P4₂2₁2 |
| Cell dimensions a, b, c (Å) | 125.4 (0.6)^a, 125.4 (0.7)^a, 54.6 (0.2)^a |
| VM (Å³/Da) / solvent content (%) | 3.2 / 61 |
| Number of crystal hits | 293,195 |
| Number of indexed patterns (% of hits) | 178,875 (61) |
| Number of unique reflections | 25,969 |
| Resolution (Å) | 20 – 2.1 (2.175 – 2.1) |
| Completeness (%) | 100 (100) |
| I/σ(I) | 11.92 (2.37) |
| Rsplit^b | 0.10 (0.35) |
| Redundancy | 7,807 (7,060) |
| Refinement | |
| No. reflections used in refinement | 24,648 |
| No. reflections used for Rfree | 1,321 |
| Rwork / Rfree | 0.181 / 0.214 |
| No. atoms: Protein | 2,386 |
| No. atoms: Carbohydrate | 67 |
| No. atoms: Water | 98 |
| B-factors (Å²): Wilson B factor | 49.0 |
| B-factors (Å²): Protein (main chain / side chain) | 46.1 / 47.7 |
| B-factors (Å²): Carbohydrate | 65.7 |
| B-factors (Å²): Water | 50.0 |
| Diffraction precision index (DPI) (Å) | 0.146 |
| R.m.s. deviations: Bond lengths (Å) | 0.013 |
| R.m.s. deviations: Bond angles (°) | 1.578 |
| R.m.s. deviations: Av. r.m.s. B-factor main/side chain atoms (Å²) | 1.491 / 1.937 |
| Ramachandran plot (%): Most favored | 96.7 |
| Ramachandran plot (%): Allowed | 3.3 |
| Ramachandran plot (%): Disallowed | 0 |


a) Values in brackets represent standard deviations determined by analyzing 178,875 crystals.

b) Rsplit as defined in White et al. (2012) (23).

Numbers in parentheses refer to the highest resolution shell.
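For orientation, Rsplit compares the intensities merged from two randomly chosen halves of the diffraction patterns. As we understand the CrystFEL definition (23), it is the sum of |I_even − I_odd| divided by half the sum of (I_even + I_odd), scaled by 1/√2. The sketch below implements that formula under this assumption; the function name and the dictionary layout of the half-datasets are hypothetical and do not reflect CrystFEL's own file formats.

```python
import math

def r_split(half_even, half_odd):
    """Rsplit from two half-dataset dicts mapping (h, k, l) -> merged intensity.

    Only reflections present in both halves contribute.
    """
    common = half_even.keys() & half_odd.keys()
    numerator = sum(abs(half_even[hkl] - half_odd[hkl]) for hkl in common)
    denominator = 0.5 * sum(half_even[hkl] + half_odd[hkl] for hkl in common)
    if denominator == 0:
        raise ValueError("no common reflections with non-zero intensity")
    return numerator / (math.sqrt(2) * denominator)

# Hypothetical intensities for illustration only.
even = {(1, 0, 0): 100.0, (0, 1, 0): 50.0}
odd = {(1, 0, 0): 90.0, (0, 1, 0): 60.0}
example_r_split = r_split(even, odd)
```

Identical halves give a value of zero, and the 1/√2 factor accounts for each half containing only half of the measurements.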


Table S3. Quality indicators for the individual resolution shells of the dataset

| Resolution shell [Å] | Number of unique reflections | Redundancy | Merged I/sigma(I) | Rsplit [%] |
|---|---|---|---|---|
| 20.000 – 4.509 | 2,793 | 7,541 | 32.91 | 3.0 |
| 4.509 – 3.585 | 2,648 | 8,094 | 27.57 | 3.3 |
| 3.585 – 3.134 | 2,609 | 8,353 | 19.12 | 3.5 |
| 3.134 – 2.848 | 2,588 | 7,656 | 12.04 | 5.8 |
| 2.848 – 2.645 | 2,588 | 7,656 | 8.22 | 11.5 |
| 2.645 – 2.489 | 2,568 | 7,968 | 5.93 | 16.2 |
| 2.489 – 2.365 | 2,552 | 8,212 | 4.55 | 19.6 |
| 2.365 – 2.262 | 2,560 | 7,899 | 3.59 | 24.4 |
| 2.262 – 2.175 | 2,536 | 7,505 | 2.93 | 28.5 |
| 2.175 – 2.100 | 2,540 | 7,060 | 2.37 | 35.3 |


Table S4. Internal hydrogen bonds and ionic interactions that stabilize the structure of T. brucei and human CatB propeptides

| TbCatB propeptide: Donor | Acceptor | Distance (Å) | Human CatB propeptide (13): Donor | Acceptor | Distance (Å) |
|---|---|---|---|---|---|
| - | - | - | ARG 23 NE | ASP 60 OD1 | 3.5 |
| - | - | - | ARG 25 NH1 | PRO 26 O | 3.0 |
| - | - | - | HIS 29 ND1 | PHE 28 O | 2.6 |
| ALA 29 N | ALA 26 O | 2.8 | - | - | - |
| VAL 31 N | ILE 59 O | 3.1 | - | - | - |
| PHE 36 N | SER 33 OG | 3.3 | LEU 36 N | SER 33 OG | 3.4 |
| VAL 37 N | SER 33 O | 3.1 | VAL 37 N | SER 33 O | 3.3 |
| ASP 38 N | LYS 34 O | 2.9 | ASN 38 N | ASP 34 O | 2.6 |
| - | - | - | ASN 38 ND2 | ASP 34 OD2 | 2.9 |
| ARG 39 N | ALA 35 O | 3.0 | TYR 39 N | GLU 35 O | 2.8 |
| VAL 40 N | PHE 36 O | 3.0 | VAL 40 N | LEU 36 O | 3.0 |
| ASN 41 N | VAL 37 O | 3.0 | ASN 41 N | VAL 37 O | 3.1 |
| ASN 41 ND2 | ALA 50 O | 2.7 | ASN 41 ND2 | ALA 50 O | 3.0 |
| ARG 42 N | ASP 38 O | 3.0 | LYS 42 N | ASN 38 O | 2.7 |
| LEU 43 N | ARG 39 O | 2.9 | ARG 43 N | TYR 39 O | 2.9 |
| - | - | - | ARG 43 NE | TYR 39 OH | 3.0 |
| ASN 44 N | VAL 40 O | 2.8 | ASN 44 N | ASN 41 O | 3.2 |
| ASN 44 ND2 | ASN 41 OD1 | 3.2 | - | - | - |
| ARG 45 N | ARG 42 O | 3.3 | - | - | - |
| GLY 46 N | ASN 44 OD1 | 3.0 | - | - | - |
| ILE 47 N | ASN 44 O | 3.4 | - | - | - |
| ALA 50 N | ASN 41 OD1 | 2.9 | ALA 50 N | ASN 41 OD1 | 2.8 |
| LYS 51 NZ | ASP 53 OD1 | 3.4 | - | - | - |
| - | - | - | HIS 52 NE2 | LEU 32 O | 2.6 |
| GLN 57 N | GLY 54 O | 3.0 | - | - | - |
| - | - | - | VAL 59 N | TYR 57 O | 2.8 |
| ILE 59 N | MET 56 O | 3.0 | - | - | - |
| THR 60 N | GLU 63 OE1 | 3.0 | - | - | - |
| LEU 61 N | ALA 29 O | 3.0 | - | - | - |
| ARG 62 N | GLU 27 O | 2.9 | - | - | - |
| ALA 64 N | THR 60 O | 3.0 | LEU 64 N | ASP 60 O | 3.4 |
| LYS 65 N | LEU 61 O | 3.0 | LYS 65 N | MET 61 O | 3.4 |
| LYS 65 NZ | GLU 27 OE2 | 3.4 | - | - | - |
| ARG 66 N | ARG 62 O | 3.1 | ARG 66 N | SER 62 O | 3.1 |
| LEU 67 N | GLU 63 O | 3.3 | LEU 67 N | TYR 63 O | 3.1 |
| ASN 68 N | LYS 65 O | 3.2 | CYS 68 N | LYS 65 O | 3.1 |
| ARG 83 NH1 | ARG 84 O | 3.0 | - | - | - |
| ARG 83 NH1 | GLU 89 OE2 | 2.9 | - | - | - |
| THR 86 N | GLU 89 OE2 | 2.9 | - | - | - |
| GLU 89 N | THR 86 OG1 | 3.1 | - | - | - |
| ALA 90 N | THR 86 O | 2.9 | - | - | - |
| ARG 91 N | GLU 87 O | 3.2 | - | - | - |
| ALA 92 N | GLU 89 O | 3.4 | - | - | - |
| - | - | - | LEU 92 N | GLU 88 O | 2.6 |

Interactions were detected using the PISA server (35) at the European Bioinformatics Institute. Residues of the human CatB propeptide are labelled according to the TbCatB propeptide numbering to allow a direct comparison.


Table S5. Hydrogen bonds and ionic interactions between the T. brucei and the human propeptides and the corresponding CatB enzymes

| TbCatB-propeptide complex: Pro-residue | CatB residue | Distance (Å) | Human proCatB (pdb code 3PBH) (13): Pro-residue | CatB residue | Distance (Å) |
|---|---|---|---|---|---|
| ILE 47 O | VAL 270 N | 3.2 | THR 47 O | VAL 270 N | 2.7 |
| - | - | - | THR 47 OG1 | GLN 272 NE2 | 2.9 |
| - | - | - | THR 47 OG1 | VAL 270 O | 2.8 |
| TRP 48 NE1 | HIS 272 O | 3.0 | TRP 48 NE1 | GLN 272 O | 3.0 |
| LYS 49 N | SER 268 O | 2.9 | GLN 49 N | SER 268 O | 3.2 |
| LYS 49 O | ASN 267 N | 2.8 | GLN 49 O | LYS 267 N | 3.3 |
| LYS 51 N | ALA 265 O | 3.1 | GLY 51 N | LEU 265 O | 2.9 |
| - | - | - | ASN 53 ND2 | LEU 264 O | 2.7 |
| - | - | - | ASN 53 ND2 | TYR 266 O | 2.8 |
| - | - | - | ASN 53 OD1 | GLY 309 N | 3.0 |
| - | - | - | ASP 60 N | CYS 205 O | 3.7 |
| - | - | - | TYR 63 OH | CYS 192 O | 3.2 |
| ARG 66 NH1 | ALA 118 O | 3.3 | ARG 66 NE | SER 118 O | 3.4 |
| ARG 66 NH2 | ASN 163 OD1 | 3.1 | ARG 66 NH2 | ASN 163 OD1 | 3.1 |
| ARG 66 NH1 | PHE 210 O | 2.9 | ARG 66 NH1 | GLY 210 O | 2.6 |
| LEU 67 O | GLN 116 NE2 | 2.9 | LEU 67 O | GLN 116 NE2 | 3.0 |
| LEU 67 O | HIS 282 NE2 | 3.5 | LEU 67 O | HIS 282 NE2 | 2.5 |
| ASN 68 ND2 | LEU 279 O | 3.6 | - | - | - |
| GLY 69 N | CYS 122 SG | 3.2 | GLY 69 N | CYS 122 SG | 3.2 |
| GLY 69 O | GLY 165 N | 2.9 | GLY 69 O | GLY 165 N | 3.3 |
| VAL 70 N | GLY 281 O | 3.0 | THR 70 N | GLY 281 O | 3.2 |
| - | - | - | THR 70 OG1 | CYS 122 SG | 3.3 |
| - | - | - | THR 70 OG1 | ALA 283 N | 3.0 |
| ILE 71 N | GLY 165 O | 3.0 | PHE 71 N | GLY 165 O | 3.1 |
| - | - | - | GLY 73 N | GLU 328 OE1 | 3.2 |
| PRO 81 O | ALA 236 N | 2.8 | GLN 81 O | SER 236 N | 3.2 |
| | | | GLN 81 NE2 | SER 234 O | 3.0 |
| LYS 82 NZ | ASP 242 OD2 | 3.6 | | | |
| - | - | - | ARG 82 NH2 | SER 238 O | 3.2 |
| ARG 83 N | ALA 236 O | 3.0 | VAL 83 N | SER 236 O | 2.9 |
| ARG 83 NE | GLU 246 OE1 | 2.8 | - | - | - |
| - | - | - | PHE 85 N | ASP 242 OD1 | 2.9 |
| GLU 89 O | ARG 245 NH1 | 2.9 | - | - | - |
| ALA 92 O | ARG 245 NH1 | 3.0 | - | - | - |

Interactions were detected using the PISA server (35) at the European Bioinformatics Institute. Residues of human CatB and its propeptide are labelled according to the TbCatB numbering to allow a direct comparison.


References and Notes

1. J. A. Frearson et al., N-Myristoyltransferase inhibitors as new leads to treat sleeping sickness. Nature 464, 728 (2010).

2. M. P. Barrett, D. W. Boykin, R. Brun, R. R. Tidwell, Human African trypanosomiasis: Pharmacological re-engagement with a neglected disease. Br. J. Pharmacol. 152, 1155 (2007).

3. S. Alsford et al., High-throughput decoding of antitrypanosomal drug efficacy and resistance. Nature 482, 232 (2012).

4. A. H. Fairlamb, Chemotherapy of human African trypanosomiasis: Current and future prospects. Trends Parasitol. 19, 488 (2003).

5. C. Bryant et al., Novel non-peptidic vinylsulfones targeting the S2 and S3 subsites of parasite cysteine proteases. Bioorg. Med. Chem. Lett. 19, 6218 (2009).

6. M. H. Abdulla et al., RNA interference of Trypanosoma brucei cathepsin B and L affects disease progression in a mouse model. PLoS Negl. Trop. Dis. 2, e298 (2008).

7. F. Lecaille, J. Kaleta, D. Brömme, Human and parasitic papain-like cysteine proteases: Their role in physiology and pathology and recent developments in inhibitor design. Chem. Rev. 102, 4459 (2002).

8. Y. Yamamoto, M. Kurata, S. Watabe, R. Murakami, S. Y. Takahashi, Novel cysteine proteinase inhibitors homologous to the proregions of cysteine proteinases. Curr. Protein Pept. Sci. 3, 231 (2002).

9. G. Dubin, Proteinaceous cysteine protease inhibitors. Cell. Mol. Life Sci. 62, 653 (2005).

10. I. D. Kerr, P. Wu, R. Marion-Tsukamaki, Z. B. Mackey, L. S. Brinen, Crystal Structures of TbCatB and rhodesain, potential chemotherapeutic targets and major cysteine proteases of Trypanosoma brucei. PLoS Negl. Trop. Dis. 4, e701 (2010).

11. R. Koopmann et al., In vivo protein crystallization opens new routes in structural biology. Nat. Methods 9, 259 (2012).

12. K. Tomoo, Development of cathepsin inhibitors and structure-based design of cathepsin B-specific inhibitor. Curr. Top. Med. Chem. 10, 696 (2010).

13. M. Podobnik, R. Kuhelj, V. Turk, D. Turk, Crystal structure of the wild-type human procathepsin B at 2.5 A resolution reveals the native active site of a papain-like cysteine protease zymogen. J. Mol. Biol. 271, 774 (1997).

14. R. M. Bill et al., Overcoming barriers to membrane protein structure determination. Nat. Biotechnol. 29, 335 (2011).

15. R. J. Southworth-Davies, M. A. Medina, I. Carmichael, E. F. Garman, Observation of decreased radiation damage at higher dose rates in room temperature protein crystallography. Structure 15, 1531 (2007).


16. C. Riekel, Recent developments in micro-diffraction on protein crystals. J. Synchrotron Radiat. 11, 4 (2004).

17. H. N. Chapman et al., Femtosecond X-ray protein nanocrystallography. Nature 470, 73 (2011).

18. A. Barty et al., Self-terminating diffraction gates femtosecond X-ray nanocrystallography measurements. Nat. Photonics 6, 35 (2011).

19. S. Boutet, G. J. Williams, The Coherent X-ray Imaging (CXI) instrument at the Linac Coherent Light Source (LCLS). New J. Phys. 12, 035024 (2010).

20. S. Boutet et al., High-resolution protein structure determination by serial femtosecond crystallography. Science 337, 362 (2012).

21. U. Weierstall, J. C. Spence, R. B. Doak, Injector for scattering measurements on fully solvated biospecies. Rev. Sci. Instrum. 83, 035108 (2012).

22. R. A. Kirian et al., Femtosecond protein nanocrystallography-data analysis methods. Opt. Express 18, 5713 (2010).

23. T. A. White et al., CrystFEL: A software suite for snapshot serial crystallography. J. Appl. Cryst. 45, 335 (2012).

24. M. E. McGrath, The lysosomal cysteine proteases. Annu. Rev. Biophys. Biomol. Struct. 28, 181 (1999).

25. J. R. Chagas, M. Ferrer-Di Martino, F. Gauthier, G. Lalmanach, Inhibition of cathepsin B by its propeptide: Use of overlapping peptides to identify a critical segment. FEBS Lett. 392, 233 (1996).

26. C. Illy et al., Role of the occluding loop in cathepsin B activity. J. Biol. Chem. 272, 1197 (1997).

27. I. Schechter, A. Berger, On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res. Commun. 27, 157 (1967).

28. O. Quraishi et al., The occluding loop in cathepsin B defines the pH dependence of inhibition by its propeptide. Biochemistry 38, 5017 (1999).

29. Y. Chen, C. Plouffe, R. Ménard, A. C. Storer, Delineating functionally important regions and residues in the cathepsin B propeptide for inhibitory activity. FEBS Lett. 393, 24 (1996).

30. P. Emma et al., First lasing and operation of an ångstrom-wavelength free-electron laser. Nat. Photonics 4, 641 (2010).

31. L. Lomb et al., An anti-settling sample delivery instrument for serial femtosecond crystallography. J. Appl. Cryst. 45, 674 (2012).

32. A. Vagin, A. Teplyakov, MOLREP: An Automated Program for Molecular Replacement. J. Appl. Cryst. 30, 1022 (1997).

33. P. Emsley, K. Cowtan, Coot: Model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126 (2004).


34. G. N. Murshudov, A. A. Vagin, E. J. Dodson, Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr. D Biol. Crystallogr. 53, 240 (1997).

35. E. Krissinel, K. Henrick, Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372, 774 (2007).

36. O. Quraishi, A. C. Storer, Identification of internal autoproteolytic cleavage sites within the prosegments of recombinant procathepsin B and procathepsin S. Contribution of a plausible unimolecular autoproteolytic event for the processing of zymogens belonging to the papain family. J. Biol. Chem. 276, 8118 (2001).

37. Z. B. Mackey, T. C. O’Brien, D. C. Greenbaum, R. B. Blank, J. H. McKerrow, A cathepsin B-like protease is required for host protein degradation in Trypanosoma brucei. J. Biol. Chem. 279, 48426 (2004).

38. M. Cygler et al., Structure of rat procathepsin B: Model for inhibition of cysteine protease activity by the proregion. Structure 4, 405 (1996).

39. D. Musil et al., The refined 2.15 A X-ray crystal structure of human liver cathepsin B: The structural basis for its specificity. EMBO J. 10, 2321 (1991).

40. Z. Jia et al., Crystal structures of recombinant rat cathepsin B and a cathepsin B-inhibitor complex. Implications for structure-based inhibitor design. J. Biol. Chem. 270, 5527 (1995).

41. S. Hasnain, T. Hirama, A. Tam, J. S. Mort, Characterization of recombinant rat cathepsin B and nonglycosylated mutants expressed in yeast. New insights into the pH dependence of cathepsin B-catalyzed hydrolyses. J. Biol. Chem. 267, 4713 (1992).

42. P. M. Steed et al., Characterization of recombinant human cathepsin B expressed at high levels in baculovirus. Protein Sci. 7, 2033 (1998).